Monday, September 20, 2010
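Converting a txt text file to Excel the easy way
Method 1:
1. Open a blank Excel workbook and use the "Open" command.
Choose File > Open from the menu; in the "Open" dialog set "Files of type" to "Text Files" (as shown in Figure 3), select the text file, and click Open.
2. Follow the "Text Import Wizard" dialog, which has three steps.
Step 1. Follow the prompts and click Next.
Step 2. Check the Tab and Comma delimiter boxes; if the TXT file contains no commas, check Tab only.
Step 3. Follow the prompts and click "Finish".
The resulting sheet is a standard Excel worksheet.
Method 2:
Open a blank Excel workbook and choose Data > Import External Data > Import Data, then select the data source.
The remaining steps repeat steps 1 and 2 above, and the TXT file is converted to Excel just as easily.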
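The same conversion can also be scripted. The following is only a minimal sketch, assuming a tab-delimited text file; the function names `text_to_rows` and `rows_to_csv` and the sample data are made up for illustration. It writes comma-separated text, which Excel opens directly, rather than a native workbook.

```python
import csv
import io


def text_to_rows(text, delimiter="\t"):
    """Split delimited text into rows of cells, much as the import wizard does."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [row for row in reader if row]  # skip blank lines


def rows_to_csv(rows):
    """Serialize rows as comma-separated text that Excel can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical sample standing in for the contents of a .txt file.
sample = "name\tqty\nhammer\t3\nnails\t200\n"
rows = text_to_rows(sample)
csv_text = rows_to_csv(rows)
```

Saving `csv_text` to a file with a `.csv` extension gives a file that opens in Excel without the wizard.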
Tuesday, June 22, 2010
Wednesday, May 26, 2010
Data Mining Algorithms: Association Rule Mining
Reposted; original post dated 2009-09-20 21:59:23. Category: DM.
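Among the knowledge patterns of data mining, association rules are one of the more important. The concept was introduced by Agrawal, Imielinski, and Swami; an association rule is a simple but very practical kind of rule in data. Association rules are descriptive patterns, and the algorithms that discover them are unsupervised learning methods.
I. Definition and Properties of Association Rules
Consider a set of transactions involving many items: item X appears in transaction 1, item Y appears in transaction 2, and both X and Y appear in transaction 3. Is there any regularity in how X and Y occur together across transactions? In knowledge discovery in databases, an association rule is the knowledge pattern that describes such regularities in the co-occurrence of items within a transaction. More precisely, an association rule quantifies how strongly the occurrence of item X influences the occurrence of item Y.
Real-world examples abound. Supermarkets, for instance, collect and store large volumes of sales data at their point-of-sale terminals. Each record is a purchase transaction that stores the transaction time, the items bought, their quantities, and the amounts paid. Such data often hides association rules of the following form: 70% of the customers who buy a hammer also buy nails. These rules are valuable: store managers can use them to plan the store layout, for example by shelving hammers and nails together to promote sales.
In some data it is less obvious that a transaction is a collection of items, but with a slight change of perspective it can be handled just like sales data. Take life insurance: each policy is a transaction. Before accepting a policy, the insurer usually records detailed information about the applicant and sometimes requires a medical examination. The policy records the applicant's age, sex, health status, employer, work address, salary level, and so on. These personal attributes can be treated as the items of the transaction. Analyzing such data can yield association rules such as: among policyholders over 40 years old who work in district A, 45% have filed a claim with the insurer. In this rule, "over 40 years old" is item X, "works in district A" is item Y, and "has filed a claim" is item Z. One plausible reading is that district A is heavily polluted and has a poor environment, so that people working there are in worse health and the claim rate is correspondingly higher.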
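Let R = {I1, I2, ..., Im} be a set of items and W a set of transactions. Each transaction T in W is a set of items, T ⊆ R. Given an itemset A and a transaction T, if A ⊆ T then T is said to support A. An association rule is an implication of the form A → B, where A and B are itemsets with A ⊆ R, B ⊆ R, and A ∩ B = ∅. Four parameters are commonly used to describe an association rule: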
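1. Confidence
Suppose that, among the transactions in W that support itemset A, c% also support itemset B; then c% is the confidence of the rule A → B. Put simply, confidence is the probability that itemset B also appears in a transaction T in which itemset A appears. For the hammer-and-nails example, confidence answers the question: if a customer buys a hammer, how likely is that customer to buy nails as well? In that example 70% of the customers who bought a hammer also bought nails, so the confidence is 70%.
2. Support
Suppose s% of the transactions in W support both itemset A and itemset B; then s% is the support of the rule A → B. Support is the probability that the union C of itemsets A and B appears in a transaction. If 1000 customers shop at the store on a given day and 100 of them buy both a hammer and nails, the support of the rule is 10%.
3. Expected confidence
Suppose e% of the transactions in W support itemset B; then e% is the expected confidence of the rule A → B. Expected confidence is the probability that itemset B appears in a transaction when no condition is imposed. If 1000 customers shop at the store on a given day and 200 of them buy nails, the expected confidence of the rule is 20%.
4. Lift
Lift is the ratio of confidence to expected confidence. It describes how strongly the occurrence of itemset A influences the occurrence of itemset B. The probability that B appears in any transaction is the expected confidence, while the probability that B appears in a transaction containing A is the confidence; their ratio therefore shows how much the probability of B changes once the condition "itemset A occurs" is added. In the example above the lift is 70% / 20% = 3.5.
Confidence measures the accuracy of an association rule, and support measures its importance. Support indicates how representative the rule is across all transactions: the larger the support, the more important the rule. A rule with very high confidence but very low support will rarely be applicable and is therefore of little importance.
Expected confidence is the support of itemset B on its own, without any influence from itemset A; lift describes how strongly A influences B. The larger the lift, the more B is affected by A. In general a useful association rule should have a lift greater than 1: only when the confidence exceeds the expected confidence does the occurrence of A promote the occurrence of B and indicate some degree of correlation between them. A rule whose lift is not greater than 1 is of no practical value.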
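The four measures can be computed directly from a list of transactions. The sketch below is illustrative only: the function name `rule_metrics` and the ten sample baskets are assumptions, not taken from the original text, and the code assumes that at least one transaction contains the antecedent and at least one contains the consequent.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return support, confidence, expected confidence and lift of A -> B."""
    a, b = set(antecedent), set(consequent)
    n = len(transactions)
    count_a = sum(1 for t in transactions if a <= t)         # transactions supporting A
    count_b = sum(1 for t in transactions if b <= t)         # transactions supporting B
    count_ab = sum(1 for t in transactions if (a | b) <= t)  # transactions supporting both
    confidence = count_ab / count_a
    expected = count_b / n
    return {
        "support": count_ab / n,
        "confidence": confidence,
        "expected_confidence": expected,
        "lift": confidence / expected,
    }


# Hypothetical data: 10 baskets, 4 with a hammer, 5 with nails, 3 with both.
baskets = (
    [{"hammer", "nails"}] * 3
    + [{"hammer"}]
    + [{"nails"}] * 2
    + [{"paint"}] * 4
)
metrics = rule_metrics(baskets, {"hammer"}, {"nails"})
```

For this data the rule hammer → nails has support 0.3, confidence 0.75, expected confidence 0.5, and lift 1.5, so buying a hammer makes buying nails more likely than it is on average.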
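II. Mining Association Rules
Of the four properties, support and confidence describe a rule most directly. By definition, an association rule exists between any two itemsets drawn from the transactions; only the values of its properties differ. If support and confidence are ignored, an enormous number of association rules can be found in a transaction database. In practice people are interested only in rules that reach a certain support and confidence. To find meaningful rules, two thresholds must therefore be given: a minimum support and a minimum confidence, which every reported rule must meet. Rules that satisfy such requirements (for example, sufficiently large support and confidence) are called strong rules.
The following points deserve attention when mining association rules:
1. Understand the data thoroughly.
2. Define a clear objective.
3. Prepare the data well. Whether this succeeds depends on the first two points, and data preparation directly affects the complexity of the problem and whether the objective is reached.
4. Choose appropriate minimum support and minimum confidence values. This depends on the user's estimate of the objective. If the values are too small, a large number of useless rules will be found, which hurts efficiency, wastes system resources, and may bury the rules of interest; if they are too large, no rules may be found and useful knowledge is missed.
5. Interpret the rules carefully. A data mining tool can find the rules that satisfy the thresholds, but it cannot judge their practical meaning. Interpreting rules requires familiarity with the business background, rich business experience, and a sufficient understanding of the data. Among the discovered rules there may be two items that seem subjectively unrelated yet show high support and confidence; business knowledge and experience are needed to judge, from several angles, whether this is a coincidence or has an underlying rationale. Conversely, items that seem closely related may turn out to be only weakly correlated. Only by interpreting rules well can one discard the dross, keep the essence, and realize their full value.
Discovering association rules involves three steps:
1. Connect to the data and prepare it;
2. Given a minimum support and a minimum confidence, find association rules with the algorithms provided by the data mining tool;
3. Visualize, interpret, and evaluate the rules.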
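III. The Association Rule Mining Process
The mining process consists of two main phases:
Phase 1 finds all frequent itemsets in the data set;
Phase 2 generates association rules from those frequent itemsets.
Phase 1 must find all frequent itemsets (also called large itemsets) in the original data set. "Frequent" means that the frequency with which an itemset occurs, relative to all records, reaches a given level. That frequency is the support of the itemset. For a 2-itemset containing the items A and B, the support of {A, B} is the number of records containing both A and B divided by the total number of records; if this support is greater than or equal to the chosen minimum support threshold, {A, B} is a frequent itemset. A k-itemset that meets the minimum support is called a frequent k-itemset, usually written Large k or Frequent k. The algorithm then generates Large k+1 from the Large k itemsets, and repeats until no frequent itemset of greater length can be found.
Phase 2 generates the association rules. Rules are produced from the frequent k-itemsets found in the previous phase; under the minimum confidence threshold, a candidate rule whose confidence meets the minimum confidence is kept as an association rule.
It also follows from the above that association rule mining is best suited to records whose attributes take discrete values. If the attributes in the original database are continuous, they should be discretized appropriately before mining (that is, each interval of values is mapped to a single value). Discretization is an important step before mining, and how sensibly it is done directly affects the mining results.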
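Discretization can be as simple as mapping each value to the label of the interval it falls in. The sketch below assumes hand-picked cut points for an age attribute; the names `discretize`, `AGE_CUTS`, and `AGE_LABELS` and the cut points themselves are illustrative choices, not part of the original text.

```python
import bisect


def discretize(value, cut_points, labels):
    """Map a continuous value to the label of its interval.

    cut_points must be sorted; labels has one more entry than cut_points.
    A value equal to a cut point belongs to the interval that starts there.
    """
    return labels[bisect.bisect_right(cut_points, value)]


# Hypothetical cut points and labels for an "age" attribute.
AGE_CUTS = [30, 40, 60]
AGE_LABELS = ["age<30", "age 30-39", "age 40-59", "age>=60"]

# Turn one record with a continuous field into a set of discrete items.
record = {"age": 45, "district": "A"}
items = {discretize(record["age"], AGE_CUTS, AGE_LABELS),
         "district=" + record["district"]}
```

After this step each record is an ordinary set of items, so the same frequent-itemset machinery used for sales data applies.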
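IV. Classification of Association Rules
Association rules can be classified along several lines:
1. By the type of variable handled in the rule, association rules are Boolean or quantitative.
Boolean association rules handle discrete, categorical values and show the relationships among those variables. Quantitative association rules can be combined with multidimensional or multilevel association rules; they handle numeric fields, either by partitioning them dynamically or by working on the raw data directly, and they may also contain categorical variables. For example, sex = "female" => occupation = "secretary" is a Boolean association rule, whereas sex = "female" => avg(income) = 2300 involves the numeric income field and is therefore a quantitative association rule.
2. By the level of abstraction of the data in the rule, association rules are single-level or multilevel.
Single-level rules ignore the fact that real data has several levels of abstraction, whereas multilevel rules take it fully into account. For example, IBM desktop => Sony printer is a single-level rule on detailed data, whereas desktop => Sony printer is a multilevel rule between a higher level and the detail level.
3. By the number of data dimensions involved, association rules are single-dimensional or multidimensional.
A single-dimensional rule involves only one dimension of the data, such as the items a user buys, whereas a multidimensional rule involves several. In other words, single-dimensional rules describe relationships within a single attribute, and multidimensional rules describe relationships between attributes. For example, beer => diapers involves only the purchased items, whereas sex = "female" => occupation = "secretary" involves two fields and is a rule over two dimensions.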
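V. Algorithms for Association Rule Mining
1. The Apriori algorithm: finding frequent itemsets using candidate itemsets
Apriori is one of the most influential algorithms for mining the frequent itemsets of Boolean association rules. At its core is a level-wise recursive procedure built on the two-phase frequent-itemset idea. The rules it produces are single-dimensional, single-level, Boolean association rules. Here, every itemset whose support is no less than the minimum support is called a frequent itemset.
The basic idea is as follows. First, find all frequent itemsets, that is, all itemsets that occur at least as often as the predefined minimum support. Then generate strong association rules from the frequent itemsets; these rules must satisfy both the minimum support and the minimum confidence. Specifically, for each frequent itemset found in the first step, generate all rules that contain only items of that itemset and have a single item on the right-hand side, following the rule definition adopted in the source literature. Once the rules are generated, only those whose confidence exceeds the user-specified minimum confidence are kept. The frequent itemsets themselves are generated recursively, level by level.
Apriori has two major drawbacks: it may generate a very large number of candidate itemsets, and it may need to scan the database repeatedly.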
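The level-wise idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not an optimized or canonical implementation: the function names `apriori` and `single_consequent_rules` and the sample baskets are made up, the minimum support is given as an absolute transaction count, and a rule is kept when its confidence is at least the threshold.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: count} for every itemset found in >= min_count transactions."""
    transactions = [frozenset(t) for t in transactions]

    def count(itemset):
        # One pass over the data per candidate: the repeated scanning noted above.
        return sum(1 for t in transactions if itemset <= t)

    # Level 1: frequent single items.
    current = {}
    for item in {i for t in transactions for i in t}:
        single = frozenset([item])
        c = count(single)
        if c >= min_count:
            current[single] = c
    frequent = dict(current)

    k = 2
    while current:
        prev = list(current)
        # Join step: unite frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k - 1))}
        current = {}
        for cand in candidates:
            c = count(cand)
            if c >= min_count:
                current[cand] = c
        frequent.update(current)
        k += 1
    return frequent


def single_consequent_rules(frequent, min_confidence):
    """List (antecedent, consequent, confidence) with one item on the right."""
    found = []
    for itemset, c in frequent.items():
        if len(itemset) < 2:
            continue
        for item in itemset:
            antecedent = itemset - {item}
            confidence = c / frequent[antecedent]  # subsets of frequent sets are frequent
            if confidence >= min_confidence:
                found.append((antecedent, frozenset([item]), confidence))
    return found


# Hypothetical baskets for illustration.
baskets = [{"hammer", "nails"}, {"hammer", "nails", "paint"},
           {"hammer", "nails"}, {"hammer"}, {"nails", "paint"}]
frequent = apriori(baskets, min_count=3)
rules = single_consequent_rules(frequent, min_confidence=0.7)
```

With these baskets the frequent itemsets are {hammer}, {nails}, and {hammer, nails}; both hammer → nails and nails → hammer reach a confidence of 0.75.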
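2. Partition-based algorithm
Savasere et al. designed a partition-based algorithm. It first divides the database logically into several disjoint partitions, considers one partition at a time and generates all frequent itemsets for it, then merges these local frequent itemsets into the set of all potentially frequent itemsets, and finally computes the support of those itemsets over the whole database. The partition size is chosen so that each partition fits in main memory, and each partition needs to be scanned only once per phase. Correctness is guaranteed because every itemset that is frequent overall must be frequent in at least one partition. The algorithm is highly parallelizable: each partition can be assigned to a separate processor to generate frequent itemsets. After each round of frequent-itemset generation, the processors communicate to produce the global candidate k-itemsets. This communication is usually the main bottleneck in execution time; on the other hand, the time each processor takes to generate its frequent itemsets is also a bottleneck.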
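3. The FP-tree frequent-itemset algorithm
To address the inherent drawbacks of Apriori, J. Han et al. proposed a method that mines frequent itemsets without candidate generation: the FP-tree frequent-itemset algorithm (FP-growth). It follows a divide-and-conquer strategy. After the first scan, the frequent items in the database are compressed into a frequent-pattern tree (FP-tree) that still retains the association information. The FP-tree is then split into a number of conditional databases, each associated with one frequent itemset of length 1, and these conditional databases are mined separately. When the original data is very large, partitioning can also be applied so that each FP-tree fits in main memory. Experiments show that FP-growth adapts well to rules of different lengths and is far more efficient than Apriori.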
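The tree-building part of this method can be sketched as follows. This covers construction and the extraction of one conditional pattern base only, not the full recursive mining; the names `FPNode`, `build_fp_tree`, and `conditional_pattern_base` and the sample baskets are illustrative assumptions. The sketch uses two passes over the data: one to count items and one to insert the ordered transactions.

```python
from collections import Counter


class FPNode:
    """One node of the FP-tree: an item, its count, and links up and down."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_count):
    # First pass: count items and keep only the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_count}

    root = FPNode(None)
    header = {item: [] for item in frequent}  # item -> tree nodes holding it
    # Second pass: insert each transaction, most frequent items first,
    # so that transactions sharing a prefix share a path in the tree.
    for t in transactions:
        ordered = sorted((i for i in set(t) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, header


def conditional_pattern_base(header, item):
    """Prefix paths that lead to `item`, each paired with that node's count."""
    base = []
    for node in header[item]:
        path = []
        parent = node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        if path:
            base.append((list(reversed(path)), node.count))
    return base


# Hypothetical baskets for illustration.
baskets = [["hammer", "nails"], ["hammer", "nails", "paint"],
           ["hammer", "nails"], ["hammer"], ["nails", "paint"]]
root, header = build_fp_tree(baskets, min_count=2)
paint_base = conditional_pattern_base(header, "paint")
```

The conditional pattern base of an item is the "conditional database" associated with that length-1 frequent itemset; mining it recursively yields the longer frequent itemsets that contain the item.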
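VI. Applications of Association Rule Mining at Home and Abroad
Association rule mining is already widely used by financial-sector companies in the West, where it helps predict the needs of bank customers. With that information a bank can improve its marketing. Banks are constantly developing new ways to communicate with customers. Many attach information about their own products that may interest the customer to their ATMs, for the benefit of the people using them. If the database shows that a customer with a high credit limit has changed address, that customer has probably just bought a larger home and may therefore need a higher credit limit, a higher-end credit card, or a home-improvement loan; these products can be advertised in the mail sent with the credit card statement. When a customer phones in, the database can give the telesales representative strong support: the representative's screen can show the customer's profile and the products the customer is likely to be interested in.
Some well-known e-commerce sites also benefit from association rule mining. These shopping sites mine association rules and then set up bundles of products that users tend to buy together. Others use the rules for cross-selling, so that a customer who buys one product is shown an advertisement for a related one.
In China, however, "massive data, scarce information" is the awkward situation that commercial banks commonly face after centralizing their data. Most databases deployed in the financial industry today support only low-level functions such as data entry, query, and statistics; they cannot uncover the useful information hidden in the data, for example by analyzing the data to find its patterns and characteristics, then inferring the financial and commercial interests of a customer, consumer group, or organization, and observing trends in the financial market. It is fair to say that research on and application of association rule mining in China is not yet broad or deep.
Recent Research on Association Rule Mining
Because many applications are more complex than the supermarket-basket problem, a large body of research has extended association rules in different directions, integrating more factors into the mining method to broaden its range of application and the scope of management decisions it can support. Examples include taking into account class hierarchies among attributes, temporal relationships, and mining across multiple tables. Recent research on association rules concentrates on two areas: extending the range of problems that classical association rules can solve, and improving the efficiency of classical mining algorithms and the interestingness of the rules.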
Tuesday, May 25, 2010
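A Brief Introduction to Bayesian Networks
1. A Bayesian network is a probabilistic network: a graphical network based on probabilistic reasoning, with Bayes' formula as its foundation. It is a mathematical model built on probabilistic reasoning, that is, the process of using information about some variables to obtain probabilistic information about others. Bayesian networks were proposed to deal with uncertainty and incompleteness; they are well suited to diagnosing faults caused by uncertainty and interdependence in complex equipment and are widely used in many fields.
2. A Bayesian network, also called a belief network, extends the Bayes method and is currently one of the most effective theoretical models for representing and reasoning with uncertain knowledge. Since Pearl proposed it in 1988, it has become a research hotspot. A Bayesian network is a directed acyclic graph (DAG) made up of nodes that represent variables and directed edges that connect them. The nodes represent random variables, and the directed edges represent the relationships between nodes (pointing from a parent node to its child nodes); conditional probabilities express the strength of each relationship, and nodes without parents carry prior probabilities. A node variable can be an abstraction of anything in the problem, such as a test value, an observed phenomenon, or a solicited opinion. Bayesian networks are suited to representing and analyzing uncertain and probabilistic events, are applied to decisions that depend conditionally on many controlling factors, and can reason from incomplete, imprecise, or uncertain knowledge or information.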
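3. Building a Bayesian network
Building a Bayesian network is a complex task that requires both knowledge engineers and domain experts, and in practice it is refined through repeated, interleaved iterations. For equipment fault diagnosis, the information needed comes from many sources, such as equipment manuals, the production process, the testing process, maintenance records, and expert experience. First, equipment faults are divided into mutually independent and jointly exhaustive categories (each category should at least have a distinguishable boundary), and a Bayesian network model is built for each category. Note that the diagnostic model is activated only when a fault occurs, so the normal state of the equipment need not be modeled. An equipment fault usually has one or several causes, each of which may in turn have one or several lower-level causes. Once the node relationships are established, the probabilities must be estimated. The approach is to assume that a given fault cause has occurred and then estimate the conditional probabilities of the nodes related to that cause; this localized estimation greatly improves efficiency.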
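Bayesian networks have the following properties:
1. A Bayesian network is itself a model of uncertain causal association. Unlike other decision models, it is a probabilistic knowledge representation and reasoning model that visualizes multivariate knowledge as a graph, and it captures the causal and conditional dependence relationships among the node variables more closely.
2. A Bayesian network is powerful at handling uncertainty. It expresses the dependencies among information elements as conditional probabilities and can learn and reason under limited, incomplete, and uncertain information.
3. A Bayesian network can represent and fuse information from multiple sources effectively. All kinds of information relevant to fault diagnosis and maintenance decisions can be brought into the network structure, handled uniformly as nodes, and fused according to how the pieces of information are related.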
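Many approximate inference algorithms have been proposed for Bayesian networks; they fall into two main classes, simulation-based and search-based. In fault diagnosis, and in our hydropower simulation work in particular, fault probabilities are usually very small, so search-based inference is generally the better fit. For a given case, the first step is to decide which kind of algorithm to use:
a.) If the belief network for the case has a simple directed-graph structure and few nodes, use exact inference, which includes the polytree propagation algorithm, the clique-tree propagation algorithm, and the graph reduction algorithm; choose the appropriate one for the case at hand;
b.) If the network drawn for the case is structurally complex and has many nodes, approximate inference can be used; in practice it is best to simplify the large, complex network first and then combine the result with exact inference.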
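Development and Theoretical Applications of Bayesian Networks
1 Introduction
In artificial intelligence, the Bayesian approach is a highly representative method for representing and reasoning with uncertain knowledge. Within it, the full joint probability formula assumes that every variable may depend conditionally on every other, which makes it expensive to compute, so in practice the simplified form of the naive Bayes classifier is used. The naive classifier, however, assumes that all variables are conditionally independent, which does not match reality well. Bayesian networks exploit the independence and conditional independence relationships among variables, which greatly reduces the number of probabilities needed to define the full joint distribution while avoiding the naive Bayes requirement that all variables be independent; they are a good compromise. As a distinctive modeling approach, Bayesian networks are being applied ever more widely.
2 Bayesian Networks
2.1 Basic Concepts of Bayesian Networks
A Bayesian network is a probabilistic network used to represent dependencies among variables: a directed acyclic graph annotated with probability distributions, which represents the joint probability distribution of a set of variables graphically.
The model consists of network nodes, formed by a set of random variables (discrete or continuous); a set of directed edges between pairs of causally related nodes; and conditional probability distributions that express the influence between nodes. A node represents a random variable and describes some feature of an entity such as a process, event, or state; an edge represents a probabilistic dependence between variables. Both the hypothesized causes and the resulting data are represented by nodes, the causal relationships among variables by the directed edges, and the degree to which one variable influences another by numerical values. A Bayesian network can therefore model the states or variables of the real world at various scales.
A Bayesian network embodies two important independence relations. First, a node is conditionally independent of its non-descendants given its parents. Second, given its Markov blanket, a node is conditionally independent of all other nodes in the network. The Markov blanket plays a very important role in Bayesian network inference.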
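The Markov blanket of a node consists of its parents, its children, and the other parents of its children, so it can be read straight off the graph. The sketch below assumes the DAG is stored as a dictionary from each node to the set of its parents; the function name `markov_blanket` and the four-node example network are illustrative and not part of the original text.

```python
def markov_blanket(parents, node):
    """Return the parents, children, and children's other parents of `node`.

    `parents` maps every node of the DAG to the set of its parent nodes.
    """
    children = {c for c, ps in parents.items() if node in ps}
    co_parents = {p for c in children for p in parents[c]}
    return (set(parents[node]) | children | co_parents) - {node}


# Hypothetical network: Cloudy -> Sprinkler, Cloudy -> Rain,
# Sprinkler -> WetGrass, Rain -> WetGrass.
dag = {
    "Cloudy": set(),
    "Sprinkler": {"Cloudy"},
    "Rain": {"Cloudy"},
    "WetGrass": {"Sprinkler", "Rain"},
}
rain_blanket = markov_blanket(dag, "Rain")
```

For `Rain` the blanket contains `Cloudy` (parent), `WetGrass` (child), and `Sprinkler` (the other parent of `WetGrass`).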
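2.2 Inference in Bayesian Networks
Inference in a Bayesian network can be exact or approximate. Exact inference can be carried out by enumeration: any conditional probability can be computed by summing certain entries of the full joint probability distribution table, according to the following formula:
$$P(X \mid e) = \alpha\, P(X, e) = \alpha \sum_{y} P(X, e, y) \qquad (1)$$
Here X is the query variable, e the observed event (the evidence), y the unobserved (hidden) variables, and α a normalization constant. By the semantics of Bayesian networks, each term of the joint distribution above can be written as a product of conditional probabilities, so the result of inference is obtained by computing products of conditional probabilities and summing them.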
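As a small illustration of inference by enumeration, the sketch below computes a posterior by summing the joint distribution over the hidden variable and normalizing. The three-node network (Rain and Sprinkler as independent causes of WetGrass) and every probability value in it are made-up assumptions chosen only to keep the example short.

```python
# Hypothetical network: Rain -> WetGrass <- Sprinkler.
P_RAIN = {True: 0.2, False: 0.8}
P_SPRINKLER = {True: 0.1, False: 0.9}
# P(WetGrass = True | Rain, Sprinkler)
P_WET = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}


def joint(rain, sprinkler, wet):
    """Joint probability as a product of each node's conditional probability."""
    p_wet = P_WET[(rain, sprinkler)]
    return P_RAIN[rain] * P_SPRINKLER[sprinkler] * (p_wet if wet else 1 - p_wet)


def posterior_rain(wet):
    """P(Rain | WetGrass = wet): sum out Sprinkler, then normalize."""
    unnormalized = {
        rain: sum(joint(rain, sprinkler, wet) for sprinkler in (True, False))
        for rain in (True, False)
    }
    alpha = 1 / sum(unnormalized.values())  # the normalization constant
    return {rain: alpha * p for rain, p in unnormalized.items()}


posterior = posterior_rain(wet=True)
```

Here Rain plays the role of the query variable X, WetGrass = True is the evidence e, and Sprinkler is the hidden variable y. With these numbers, observing wet grass raises the probability of rain from the prior 0.2 to roughly 0.74.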
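In practice, exact inference is infeasible for large, multiply connected Bayesian networks, so an approximate method is needed: Markov chain Monte Carlo (MCMC). MCMC is a relatively recent, simple, and effective Bayesian computational method. Its basic idea is to construct a Markov chain whose stationary distribution is P(x), draw samples from P(x) through the chain, and base statistical inference on those samples. MCMC commonly takes two forms: Gibbs sampling and the Metropolis-Hastings algorithm. Brooks and Giudici describe in detail how Metropolis-Hastings is applied to learning network structure. The main idea is to use Metropolis-Hastings to construct a Markov chain over Bayesian network structures whose stationary distribution is the posterior p(G | D) [1].
Let G be a known Bayesian network structure, and let nbd(G) denote the set consisting of G and the graphs obtained from G by one simple edge operation (deleting an edge, adding an edge, or reversing an edge); this set is called the neighborhood of G. When the Markov chain is constructed, a proposed move from G to G' is accepted with the probability given by the Metropolis-Hastings rule.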
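For reference, the acceptance probability takes the following form if the proposal picks G' uniformly at random from nbd(G). This is the standard Metropolis-Hastings ratio under that assumption, not a formula recovered from the original text; #nbd(G) denotes the number of graphs in the neighborhood of G, and p(G | D) is the posterior of structure G given the data D.

```latex
A(G \to G') = \min\left\{ 1,\;
  \frac{\#\mathrm{nbd}(G)\; p(G' \mid D)}{\#\mathrm{nbd}(G')\; p(G \mid D)} \right\}
```

The neighborhood sizes appear because the proposal probabilities 1/#nbd(G) and 1/#nbd(G') are generally different; when the two neighborhoods have the same size, the ratio reduces to the ratio of the two posteriors.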
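2.3 Characteristics of Bayesian Networks
As a graphical modeling tool, Bayesian networks have the following characteristics: (1) They combine directed acyclic graphs with probability theory, so they have a formal probabilistic foundation as well as an intuitive form of knowledge representation. On the one hand, human causal knowledge can be expressed directly and naturally as a directed graph; on the other, statistical data can be incorporated into the model as conditional probabilities. A Bayesian network can thus combine human prior knowledge with observed data seamlessly, overcoming both the weakness of models such as frames and semantic networks, which can express and process only quantitative information, and the lack of intuitiveness of methods such as neural networks. (2) Unlike general knowledge representation methods, a Bayesian network models the problem domain itself, so the model need not be revised when conditions or behaviors change. (3) A Bayesian network represents the joint probability of random variables graphically and can therefore handle many kinds of uncertain information. (4) A Bayesian network has no fixed input or output nodes; the nodes influence one another, so an observation of, or an intervention on, any node affects the others, and Bayesian network inference can be used to estimate and predict those effects. (5) Inference rests on Bayesian probability theory and needs no external inference mechanism; it is theoretically grounded and unifies knowledge representation with knowledge reasoning [2-4].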
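3 Structure Learning for Bayesian Networks
3.1 Structure Learning of Bayesian Networks
Usually experts determine the structure of a Bayesian network and the conditional probabilities of each node from the relationships among the things being modeled, which inevitably introduces subjectivity. How to combine expert knowledge with objective observational data to build a Bayesian network jointly, and how to learn the network structure and parameters when expert prior knowledge is lacking, are questions that concern researchers. Drawing on methods from statistics for approximately factorizing multivariate joint distributions, researchers have studied the problem from several angles, producing two broad classes of algorithms: those based on independence tests and those based on scoring and search [5-6]. Learning the structure of a Bayesian network means processing data to discover causal relationships and obtain a structural model; it is also called causal mining.
3.2 Development of Structure Learning Algorithms and Open Problems
In the 1980s researchers built Bayesian network structures from subjective causal knowledge. In 1991 Cooper and Herskovits proposed the K2 algorithm, which uses prior information for structure learning and did much to advance structure learning algorithms. In 1995 Singh and Valtorta proposed a hybrid algorithm that modifies the PC algorithm, an independence-test-based method, to obtain a node ordering and then learns the network structure with K2; this algorithm performs structure learning without prior knowledge. In 1998 researchers went on to study an independence-test-based structure learning algorithm that builds the network by testing independence relations among variables, the Boundary DAG algorithm. The history of structure learning shows that these algorithms all aim to build causal Bayesian network models: the objects of study are data sets preprocessed by experts, and the algorithms infer causal relationships among the variables from their statistical properties. Some problems therefore remain, such as the Markov equivalence class problem and the problem of overly strong assumptions.
Network structures in the same Markov equivalence class encode the same independence relations; without expert prior knowledge they cannot be distinguished from observational data, so the direction of some edges in the network cannot be determined.
In Bayesian network structure learning, many of the algorithms' assumptions cannot be met in practice, so learning under more general conditions is needed.
Later algorithms have addressed these shortcomings to some extent.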
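4 Theoretical Applications of Bayesian Networks
4.1 Bayesian-Network-Based Genetic Algorithms
In artificial intelligence research, the complex structures that genetic algorithms are applied to, such as signal processing, fuzzy pattern recognition, multi-objective optimization, fuzzy optimization, and reliability design, often involve thousands of variables that influence one another in unpredictable ways. If every variable is handled by the usual genetic operators of selection, crossover, and mutation, the iterative recombination of individual structures within the population is hard to achieve. A genetic algorithm based on Bayesian networks uses the network's probability propagation to optimize and reason about the population (the candidate solutions) generation by generation, gradually approaching the optimal solution. It keeps the advantages of genetic algorithms while reasoning and searching over uncertain propositions, and thereby extends them.
4.2 Bayesian-Network-Based Multi-Objective Optimization
Multi-objective optimization algorithms are currently a research hotspot in artificial intelligence, and the techniques explored center on various solution methods based on evolutionary computation. Standard evolutionary mechanisms, however, share several drawbacks: (1) parameters such as the crossover, mutation, and selection probabilities must be set and suitable genetic operators chosen, yet neither the principled setting of parameters nor the choice of operators has been solved satisfactorily; (2) simple crossover and mutation operators tend to break up or lose building blocks, although the proper growth and mixing of building blocks is important to successful evolution; (3) each candidate solution is treated as an independent individual, which ignores the relationships among candidate solutions. For these reasons, multi-objective evolutionary computation suffers from ineffective evolution and wasted computation.
Note that the set of candidate solutions filtered by a selection mechanism reflects the essential relationships among individuals, represents the direction and strength of evolution, and affects the effectiveness of the algorithm. The evolutionary process should therefore attend to the collective properties of this set of good solutions and mine the information it contains, so that this information can be used to make sure building blocks accumulate properly. Research can innovate in the evolutionary mechanism itself by looking for effective tools that describe the essential relationships among individuals in detail and using them to learn from the population.
Linking Bayesian networks (BN) to the set of good solutions in the population, the search information in the population is expressed in a Bayesian network: the network structure corresponds to the relationships among the codes of individuals, and the network parameters to the strength of those relationships. Knowledge reasoning based on these strengths passes evolutionary information on to the next generation. This approach is called an evolutionary algorithm based on graphical models; it is a further development of probabilistic-model-based genetic algorithms and is attracting growing attention. Since theory must ultimately serve practice, analyzing the scoring and search mechanisms of current Bayesian networks and then proposing and applying a new Bayesian multi-objective optimization algorithm is a highly worthwhile research direction.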
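5 Conclusion
With their continued development and their clear advantages, Bayesian networks will soon become an effective tool for reasoning and modeling under uncertainty in artificial intelligence. Using Bayesian networks to model and reason about uncertain relationships among events or attributes is widely applied in data analysis for decision making, feature fusion, and classification, and has produced many results in areas such as medical diagnosis, fault diagnosis, and prediction.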
References
[1] Shi Huifeng, Gu Gendai. Learning Bayesian networks based on the MCMC algorithm [J]. Journal of North China Electric Power University, 2004, 31(4): 109-112.
[2] Dawid A P. Applications of a general propagation algorithm for probabilistic expert systems [J]. Statistics and Computing, 1992, 2(2): 25-26.
[3] Buntine W L. Operations for learning with graphical models [J]. Journal of Artificial Intelligence Research, 1994, 2: 159-225.
[4] Lauritzen S L, Spiegelhalter D J. Local computations with probabilities on graphical structures and their application to expert systems [J]. Journal of the Royal Statistical Society, 1988, 50(2): 157-224.
[5] Chickering D M. Learning Bayesian networks is NP-complete [A]. Learning from Data: AI and Statistics V [M]. New York: Springer, 1996. 121-130.
[6] Chow C K, Liu C N. Approximating discrete probability distributions with dependence trees [J]. IEEE Transactions on Information Theory, 1968, IT-14(3): 462-467.
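An Assessment of International Conferences on Evolutionary Computation
An assessment of conferences on evolutionary computation and evolvable hardware (reposted, 2008-12-29 21:22). I wrote this piece purely under the influence of the "AI Conferences" article by Professor Zhou Zhihua of Nanjing University, which gave me the idea of writing down my own conference tier list. The ratings are based mainly on the quality of the proceedings I have come across over the past few years, my impressions from attending, CiteSeer's "Estimated impact of publication venues in Computer Science", and how specialized the conference is (my own focus is evolvable hardware). Most of the conferences published in LNCS and a few other important related conferences are discussed.
Grade A conferences are the most authoritative international conferences on evolvable hardware, and most papers published there are worth reading. Grade B conferences are also of very good quality. Grade C conferences are, in my personal view, still decent places to publish, but I would not normally look for references in proceedings at this level. Grade D conferences are usually those that publish more than 500 papers, or "international" conferences that never leave China.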
--------------------------------------------------------------------------------
grade A:
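ICES (A0): International Conference on Evolvable Systems: From Biology to Hardware. An important conference that has been held ever since research on evolvable hardware (EHW) began to attract academic attention in the 1990s, and one of the two most important EHW conferences. It was first held in Switzerland in 1995 (not yet under the name ICES, which dates from 1996), and the EHW pioneers of the time, such as Eduardo Sanchez of Switzerland and Tetsuya Higuchi of Japan, took part. It is held roughly every two years, in odd-numbered years since 2001. Oddly, it was held in consecutive years in 2007 and 2008, in China and the Czech Republic respectively, so there is no firm pattern. The ICES proceedings have always been published in LNCS, and each edition publishes roughly 40 papers or fewer: ICES 2005 contained 21 papers, and ICES 2007 published 41, an acceptance rate of 33%.
AHS (A0): formerly the NASA/DoD Conference on Evolvable Hardware, renamed the NASA/ESA Conference on Adaptive Hardware and Systems in 2006. The conference began in 1999 with funding from NASA and the US Department of Defense and is the other important EHW conference. It is held annually, and the proceedings are published by IEEE CS. Like ICES, it is where the leading figures in evolvable hardware gather. The most recent edition, AHS 2007, grew again over AHS 2006: the number of papers rose from about 80 at AHS 06 to more than 100.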
*Impact factor (According to Citeseer 03):
ICES:0.72 (top 36.77%)
Average:0.72 (top 36.77%)
--------------------------------------------------------------------------------
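grade B:
PPSN (B+): International Conference on Parallel Problem Solving from Nature. Held since 1990, with evolutionary computation and natural computing as its main topics. CiteSeer rates it highly (impact factor 1.13, top 18.09%), on a par with quite a few tier-1 AI conferences (for example NIPS: impact factor 1.06, top 20.96%; SIGIR: impact factor 1.10, top 19.08%), so there is a good case for grade A. Personally, though, I find the number of papers somewhat high for its fairly narrow scope (PPSN 2004 passed the 100-paper mark), and I have some doubt that the quality of every paper can be guaranteed. PPSN is also not focused on EHW the way ICES and AHS are, so I think B+ is a reasonable place for it. Either way it is a top conference in its field and attracts many leading researchers; Professor Xin Yao (Editor-in-Chief of IEEE Trans. on Evolutionary Computation, FIEEE), for example, is a regular.
EuroGP (B+): European Conference on Genetic Programming. The European conference on genetic programming. I submitted a paper here in 2006 and, regrettably, it was rejected, but I have to say the conference is very good. Most of the reviewers' comments were careful and fair, and they helped improve the paper. I have submitted to a dozen or so conferences in recent years, including ICES, the most authoritative in EHW, and EuroGP's reviewing was the best.
EMO (B0): International Conference on Evolutionary Multi-Criterion Optimization. A specialized conference on evolutionary multi-objective optimization. It started in 2001, with the first edition in Switzerland, and has been held every two years since, moving through Portugal and Mexico; the fourth EMO, in 2007, was held in Japan. The reviewing is above average in care, and taking CiteSeer's rating into account as well, B0 seems about right.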
*Impact factor (According to Citeseer 03):
PPSN:1.13 (top 18.09%)
EuroGP:0.34 (top 61.50%)
EMO:0.30 (top 65.35%)
Average:0.59 (top 44.63%)
--------------------------------------------------------------------------------
grade C:
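PRICAI (C+): Pacific Rim International Conference on Artificial Intelligence. A regional AI conference for the Pacific Rim, first held in Japan in 1990 and since hosted in turn by China, Japan, Korea, Australia, New Zealand, and other countries. PRICAI covers the whole of AI. Paper counts: PRICAI'06 had about 100 full papers (oral) and 50 short papers (poster). The reviewing felt quite average.
SEAL (C+): International Conference on Simulated Evolution and Learning. An Asia-Pacific conference on evolution and learning. My impression is that the first few editions were good, but the most recent, SEAL 2006, shows a sharp decline: most of the published papers came from China, and it is drifting toward becoming a low-quality conference. It publishes 120+ papers.
MICAI (C+): Mexican International Conference on Artificial Intelligence. The Mexican AI conference, and an interesting one: the proceedings are published separately by LNAI (oral) and IEEE CS (poster). MICAI 2006 accepted 123 oral papers (447 submissions) and 55 posters (167 submissions). The reviewing is fairly serious, and the reviews contain useful feedback.
ICANNGA (C+): International Conference on Adaptive and Natural Computing Algorithms. A conference on adaptive and bio-inspired algorithms with a fairly long history: first held in Austria in 1993 and every two years since, mostly around Europe. ICANNGA'07 published 178 papers (acceptance rate 38%). The reviewers did not seem very careful, however, and their comments offered little to go on. I attended ICANNGA'07: the academic atmosphere was good and the social events were warmly organized, so it is still worth attending.
IEA/AIE (C+): International Conference on Industrial, Engineering & Other Applications of Applied Intelligent Systems. A conference on applications of intelligent systems. 2007 marks its 20th edition; it is held every year and moves all over the world. The 2007 edition is to be held in Japan; of 462 submissions, 125 papers were accepted (counting full, short, and poster papers). I had a full paper accepted at IEA/AIE 2007, and my overall impression is that the reviewers were only moderately careful.
AUS-AI (C+): Australian Joint Conference on Artificial Intelligence. Australia's regional AI conference. I have submitted there, and the reviewing felt only moderately rigorous. Its proceedings have always been published in LNAI, and the acceptance rate is fairly low (2005: 77 full + 119 short papers / 535 submissions = 0.366; 2006: 89 full + 70 short papers / 689 submissions = 0.231). I expect the acceptance rate to rise again now that LNAI has been dropped by ISI.
GECCO (C+): Genetic and Evolutionary Computation Conference. A worldwide gathering for genetic algorithm research, with proceedings published by ACM. It is placed at C+ mainly because of its rather high acceptance rate (2005: 253 oral + 120 poster / 549 submissions = 0.679; 2006: 205 oral + 143 poster / 446 submissions = 0.780) and its more than 300 published papers, which inevitably raise some concern about the quality of the proceedings.
CEC (C+): IEEE Congress on Evolutionary Computation. A major annual IEEE event on evolutionary computation. Judging by its recent publication figures (acceptance ratio 2004: 174 accepted papers / 460 submissions = 0.378; 2005: 379 accepted papers / 660 submissions = 0.574), it is steadily expanding, and the papers are inevitably of mixed quality, so grouping it with GECCO is appropriate. CEC 07 and CEC 08 are held in Singapore and Hong Kong respectively, a good opportunity for students in China to attend a major international conference.
ICONIP (C0): International Conference on Neural Information Processing. The main event of the Asia-Pacific neural network community, held every year somewhere in the Asia-Pacific region. It publishes around 300 papers a year, with an acceptance rate of about 40%.
IWANN (C0): International Work-Conference on Artificial and Natural Neural Networks. A Spanish conference on neural networks with a fairly long history (since 1991), held in odd-numbered years. Although it focuses on neural networks, in recent years it has also taken in other bio-inspired research. IWANN 2005 received 240 submissions and accepted 150 papers, a very high acceptance rate. C0 seems the right place for it.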
*Impact factor (According to Citeseer 03):
PRICAI:0.19 (top 76.33%)
SEAL:0.37 (top 58.96%)
MICAI:0.02 (top 96.56%)
IEA/AIE:0.09 (top 87.79%)
AUS-AI:0.16 (top 79.44%)
IWANN:0.16 (top 79.85%)
Average:0.16 (top 79.85%)
--------------------------------------------------------------------------------
grade D
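KES (D+): International Conference on Knowledge-Based and Intelligent Information & Engineering Systems. The theme of KES is computational intelligence, a vast scope. It is held annually at venues around the world, and KES 2007 is already the 11th edition. Given its alarming number of published papers each year (600+) and the fact that CiteSeer gives it no rating, D+ is appropriate.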
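ICCS (D+): International Conference on Computational Science. Another conference with a very broad theme, comprising a main conference and dozens of workshops, only some of which relate to evolutionary computing. It is held worldwide. Publication figures: acceptance ratio 2006: 627 accepted papers / 1400 submissions = 0.448; acceptance ratio 2007: 716 accepted papers / roughly 2400 submissions = 0.298 (the submission count is inferred from the stated ratio; the source prints 660 and repeats the year 2006). In the last two years it has filled four large LNCS volumes each time; apart from its size it has no particular distinguishing feature.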
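CIS (D0): International Conference on Computational Intelligence and Security. The title alone shows how broad it is. It seems to have been held annually since 2005 and never leaves China. At its first edition in 2005 it was flooded with 1800+ submissions and accepted 337, filling two thick volumes of proceedings. Compared with some of the paper-mill conferences that later sprang up in China, though, the 300-odd papers of CIS 2005 hardly count as many.
ISNN (D0): International Symposium on Neural Networks. A Chinese conference on neural networks. The first edition was held in Dalian in 2004, followed by Chongqing, Chengdu, and Nanjing in turn. The participants are mainly academics from China and Korea. ISNN 2006 accepted 616 papers, an acceptance rate of 25%. (Note: conferences in China generally receive huge numbers of submissions, and more than 3000 is not unusual; a low acceptance rate does not mean these conferences are of high quality.)
ICIC (D0): International Conference on Intelligent Computing. Another recent Chinese conference, on intelligent computing, held annually. ICIC 06 received about 3000+ submissions and accepted 700+, an acceptance rate of about 23.4%.
ICMLC (D0): International Conference on Machine Learning and Cybernetics. A conference on machine learning and cybernetics, also with a torrent of submissions: in 2005 there were 2461 submissions and about 1000 were accepted, of which LNAI published a hundred or so. Enough said.
ICNC-FSKD (D0): International Conference on Natural Computation & International Conference on Fuzzy Systems and Knowledge Discovery. Yet another conference that stays within China; in 2006 it published about 400+ papers, with an acceptance rate of about 13%.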
*Impact factor (According to Citeseer 03):
ICCS:0.05 (top 91.40%)
Average:0.05 (top 91.40%)
grade A 的会议是演化硬件方向最权威的国际会议,其中发表的论文大多有一读的价值. B 的会议也是质量非常不错的会议, C 的会议, 我个人认为去发表论文还是不错的,但是通常不会在这个档次的会议proceeding中找参考论文.D 的会议,通常是一些发表论文数量超过500篇,或者永远在中国国内打转的"国际"会议.
--------------------------------------------------------------------------------
grade A:
ICES (A0): International Conference on Evolvable Systems: From Biology to Hareware. 基本上是从90年代evolvable hardware(EHW)的研究引起学术界的关注就开始举行的重要会议.EHW方面最重要的2个会议之一.首次举行是在1995年的瑞士(当时还不叫ICES,96年才开始现在的名字),基本上当时的EHW开创性人物瑞士的Eduardo Sanchez,日本的Tetsuya Higuchi等人都参加了这次会议.此会基本2年一次,2001年以后是单数年召开.奇怪的是2007,2008又连续在中国和捷克分别举行,实在是没什么规律.ICES的proceeding一直在LNCS上出版,每届发表的papers大概在40篇以下.例如ICES2005包括了21篇论文,而ICES2007发表了41篇,接受率33%.
AHS (A0): 以前叫NASA/DoD Conference on Evolvable hardware,2006开始改名为NASA/ESA Conference on Adaptive hardware and systems. 这个会议始于1999年,得到了美国NASA和国防部的资助,是EHW方面另外一个重要的会议. 每年一届.会议的proceeding由IEEE CS出版,基本和ICES一样,也就是那一帮堆搞演化硬件的权威人士出没的地方.最新一次的AHS 2007在AHS2006的基础上再次扩军,论文数由AHS 06的80篇左右涨到了过100篇.
*Impact factor (According to Citeseer 03):
ICES:0.72 (top 36.77%)
Average:0.72 (top 36.77%)
--------------------------------------------------------------------------------
grade B
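PPSN (B+): International Conference on Parallel Problem Solving from Nature. Started in 1990; its main topics are evolutionary computation and natural computing. CiteSeer rates this conference quite highly (impact factor 1.13, top 18.09%), on a par with a number of tier-1 AI conferences (for example, NIPS: impact factor 1.06, top 20.96%; SIGIR: impact factor 1.10, top 19.08%), so there would be good grounds for placing it in grade A. However, I feel it publishes slightly too many papers for its fairly limited scope (PPSN 2004 passed the 100-paper mark), which makes me doubt a little whether the quality of every paper can be guaranteed. PPSN is also not as focused on EHW as ICES and AHS are, so I think B+ is the reasonable place for it. Whatever the grade, it is a top conference in its field and draws plenty of big names; for example, Professor Xin Yao (Editor-in-Chief of IEEE Trans. on Evolutionary Computation, IEEE Fellow) is a regular attendee.
EuroGP (B+): European Conference on Genetic Programming. The European conference on genetic programming. I submitted a paper here in 2006, and although it was regrettably rejected, I have to say the conference is really very good. Most of the reviewers' comments were careful and fair, and they helped improve the paper. Over the past few years I have submitted to ten or so conferences, including ICES, the most authoritative in EHW, but as far as reviewing goes, EuroGP was the best.
EMO (B0): International Conference on Evolutionary Multi-Criterion Optimization. A specialized conference on evolutionary multi-objective optimization. It started in 2001, with the first edition held in Switzerland, and has been held every two years since, moving through Portugal and Mexico; the fourth EMO, in 2007, was held in Japan. The care taken in reviewing can be considered above average, and taking the CiteSeer rating into account as well, B0 should be an uncontroversial placement.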
*Impact factor (According to Citeseer 03):
PPSN:1.13 (top 18.09%)
EuroGP:0.34 (top 61.50%)
EMO:0.30 (top 65.35%)
Average:0.59 (top 44.63%)
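The "Average" figure for grade B can be reproduced as the plain arithmetic mean of the three listed impact factors. Below is a minimal Python sketch, assuming an unweighted mean rounded to two decimals. The name `average_impact` is made up for illustration, and since the list does not say how the percentile attached to the average was obtained, the sketch does not compute it.

```python
from statistics import mean

# Grade B impact factors exactly as listed above (CiteSeer 2003).
GRADE_B = {"PPSN": 1.13, "EuroGP": 0.34, "EMO": 0.30}


def average_impact(factors):
    """Unweighted mean of the impact factors, rounded to two decimals."""
    return round(mean(factors.values()), 2)


if __name__ == "__main__":
    # Prints the value in the same style as the list: Average:0.59
    print(f"Average:{average_impact(GRADE_B):.2f}")
```

The same function applies to any other grade's impact-factor list once its values are placed in a dictionary.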
--------------------------------------------------------------------------------
grade C:
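PRICAI (C+): Pacific Rim International Conference on Artificial Intelligence. A regional AI conference for the Pacific Rim. It started in Japan in 1990 and has since rotated among China, Japan, Korea, Australia, New Zealand, and other countries. PRICAI covers the whole of AI. Number of papers published: PRICAI'06 had roughly 100 full papers (oral) and 50 short papers (poster). The reviewing felt very ordinary.
SEAL (C+): International Conference on Simulated Evolution and Learning. An Asia-Pacific conference on evolution and learning. My impression is that the first few editions were quite good, but the most recent, SEAL 2006, shows a marked decline: most of the published papers came from China, and it is drifting toward becoming a low-quality, padded conference. It publishes 120+ papers.
MICAI (C+): Mexican International Conference on Artificial Intelligence. Mexico's AI conference, and an interesting one: the proceedings are published separately by LNAI (oral) and IEEE CS (poster). MICAI 2006 accepted 123 oral papers (447 submissions) and 55 posters (167 submissions). The reviewing is fairly serious, and you can get some useful feedback from it.
ICANNGA (C+): International Conference on Adaptive and Natural Computing Algorithms. A conference on adaptive and bio-inspired algorithms. It has a fairly long history: first held in Austria in 1993 and every two years since, mostly moving around Europe. ICANNGA'07 published 178 papers (acceptance rate 38%). The reviewers did not seem very careful, though, and their comments offered little of use. As for the conference itself, I attended ICANNGA'07 and found the academic atmosphere very good and the social events warmly organized; it is a conference worth attending.
IEA/AIE (C+): International Conference on Industrial, Engineering & Other Applications of Applied Intelligent Systems. A conference on applications of intelligent systems. The 2007 edition is the 20th; it is held every year and moves all over the world. The 2007 edition is to be held in Japan, with 462 submissions and 125 accepted papers (counting all full, short, and poster papers). I had a full paper accepted at IEA/AIE 2007; overall, the reviewers' care seemed merely average.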
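AUS-AI (C+): Australian Joint Conference on Artificial Intelligence. Australia's regional AI conference. I have submitted to it, and the reviewing did not seem very rigorous. Its proceedings have always been published in LNAI, and its acceptance rate (2005: 77 full + 119 short papers / 535 submissions = 0.366; 2006: 89 full + 70 short papers / 689 submissions = 0.231) is fairly low; I expect the acceptance rate will rise again somewhat after LNAI is dropped by ISI.
GECCO (C+): Genetic and Evolutionary Computation Conference. A worldwide gathering for genetic algorithm research, with proceedings published by ACM. I put it at C+ mainly because of its rather high acceptance rate (acceptance ratio 2005: 253 oral + 120 poster / 549 submissions = 0.679; acceptance ratio 2006: 205 oral + 143 poster / 446 submissions = 0.780) and its more than 300 published papers, which inevitably raise some concern about the quality of the proceedings.
CEC (C+): IEEE Congress on Evolutionary Computation. A major evolutionary computation event run by IEEE, held annually. Its publication figures for recent years (acceptance ratio 2004: 174 accepted papers / 460 submissions = 0.378; acceptance ratio 2005: 379 accepted papers / 660 submissions = 0.574) show a trend of gradual expansion, so the papers are inevitably a mixed bag, and grouping it with GECCO is appropriate. CEC 07 and CEC 08 are being held in Singapore and Hong Kong respectively, a good opportunity for students in China to attend a major international academic event.
ICONIP (C0): International Conference on Neural Information Processing. The main event of the Asia-Pacific neural network community, held each year somewhere in the Asia-Pacific region. It publishes around 300 papers a year, with an acceptance rate of about 40%.
IWANN (C0): International Work-Conference on Artificial and Natural Neural Networks. A Spanish conference on neural networks with a fairly long history (since 1991), held in odd-numbered years. Although it leans toward neural networks, in recent years it has also moved into other bio-inspired research. IWANN 2005 received 240 submissions and accepted 150 papers, a very high acceptance rate! C0 should be a suitable placement.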
*Impact factor (According to Citeseer 03):
PRICAI:0.19 (top 76.33%)
SEAL:0.37 (top 58.96%)
MICAI:0.02 (top 96.56%)
IEA/AIE:0.09 (top 87.79%)
AUS-AI:0.16 (top 79.44%)
IWANN:0.16 (top 79.85%)
Average:0.16 (top 79.85%)
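The acceptance ratios quoted in the grade C entries all follow the same arithmetic: accepted papers, summed over categories such as oral and poster (or full and short), divided by submissions. A minimal Python sketch of that calculation follows, using only figures quoted in the AUS-AI, GECCO, and CEC entries; the function name `acceptance_ratio` and the `QUOTED` table are illustrative, not part of the original list.

```python
def acceptance_ratio(accepted_counts, submissions):
    """Accepted papers (all categories summed) divided by submissions."""
    if submissions <= 0:
        raise ValueError("submissions must be positive")
    return sum(accepted_counts) / submissions


# Figures quoted in the grade C entries: (accepted per category, submissions).
QUOTED = {
    "AUS-AI 2005": ((77, 119), 535),
    "AUS-AI 2006": ((89, 70), 689),
    "GECCO 2005": ((253, 120), 549),
    "GECCO 2006": ((205, 143), 446),
    "CEC 2004": ((174,), 460),
    "CEC 2005": ((379,), 660),
}

if __name__ == "__main__":
    for name, (accepted, submitted) in QUOTED.items():
        print(f"{name}: {acceptance_ratio(accepted, submitted):.3f}")
```

Rounded to three decimals, these reproduce the ratios given in the entries (0.366, 0.231, 0.679, 0.780, 0.378, 0.574).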
--------------------------------------------------------------------------------
grade D
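KES (D+): International Conference on Knowledge-Based and Intelligent Information & Engineering Systems. KES is about computational intelligence, a vast scope. It is held annually (KES 2007 is already the 11th) and moves all over the world. Given the alarming number of papers KES publishes each year (600+) and the fact that CiteSeer does not rate it, D+ is an appropriate placement.
ICCS (D+): International Conference on Computational Science. Another conference with a very broad theme, consisting of a main track and dozens of workshops; in fact only some of the workshops relate to evolutionary computing. It is held around the world. Publication figures (acceptance ratio 2006: 627 accepted papers / 1400 submissions = 0.448; acceptance ratio 2006: 716 accepted papers / 660 submissions = 0.298): for the past two years the proceedings have filled four large LNCS volumes. Apart from its size, it has no particular distinguishing feature.
CIS (D0): International Conference on Computational Intelligence and Security. The title alone tells you how broad its scope is. It seems to have been held annually since 2005, and only ever inside China. Its first edition in 2005 drew a flood of 1800+ submissions and ended up accepting 337, filling two thick proceedings volumes. Of course, compared with some of the domestic paper-padding conferences that sprang up later, the 300-odd papers of CIS 2005 hardly count for much.
ISNN (D0): International Symposium on Neural Networks. A China-based conference on neural networks. The first edition was held in Dalian in 2004, followed by Chongqing, Chengdu, and Nanjing in turn. Attendees are mainly academics from China and Korea. ISNN 2006 accepted 616 papers, an acceptance rate of 25%. (Note: conferences in China generally receive huge numbers of submissions, and more than 3000 is not unusual, so a low acceptance rate does not mean these conferences are of high quality.)
ICIC (D0): International Conference on Intelligent Computing. Another conference on intelligent computing that has appeared in China in recent years, held annually. The scale of ICIC 06 was roughly 3000+ submissions and 700+ accepted, an acceptance rate of about 23.4%.
ICMLC (D0): International Conference on Machine Learning and Cybernetics. A conference on machine learning and cybernetics, also with a torrent of submissions: 2461 in 2005, of which about 1000 were accepted. LNAI published around 100 of them. Enough said.
ICNC-FSKD (D0): International Conference on Natural Computation & International Conference on Fuzzy Systems and Knowledge Discovery. Yet another conference that keeps circulating within China. In 2006 it published roughly 400+ papers, with an acceptance rate of about 13%.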
*Impact factor (According to Citeseer 03):
ICCS:0.05 (top 91.40%)
Average:0.05 (top 91.40%)
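Quoted ratios like those in the ICCS entry can be rechecked mechanically against their own numerator and denominator. The Python sketch below is a hypothetical helper (the name `matches_stated` and the three-decimal tolerance are assumptions) applied to the two ICCS figures exactly as they appear in the entry. The first is consistent with its stated ratio; the second is not, since 716 accepted out of 660 submissions would exceed 1, which suggests a transcription slip in that entry. The sketch does not guess what the intended numbers were.

```python
def matches_stated(accepted, submissions, stated, places=3):
    """True if accepted/submissions agrees with the stated ratio
    to within half a unit in the last quoted decimal place."""
    return abs(accepted / submissions - stated) < 0.5 * 10 ** -places


# The two ICCS figures exactly as quoted: (accepted, submissions, stated ratio).
ICCS_QUOTED = [
    (627, 1400, 0.448),
    (716, 660, 0.298),
]

if __name__ == "__main__":
    for accepted, submissions, stated in ICCS_QUOTED:
        ok = matches_stated(accepted, submissions, stated)
        verdict = "consistent" if ok else "inconsistent"
        print(f"{accepted}/{submissions} vs {stated}: {verdict}")
```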
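Chinese Journals Indexed by SCI, and International Conferences
Chinese Journals Indexed by SCI, and Major International Academic Conferences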
List of Chinese journals indexed by SCI in 2007
English title | Chinese title | Publisher | Index | ISSN
CHINESE JOURNAL OF ELECTRONICS | 电子学报(英文版) | TECHNOLOGY EXCHANGE LIMITED HONG KONG | SCIE | 1022-4653
CHINESE SCIENCE BULLETIN | 科学通报 | SCIENCE CHINA PRESS | SCI CD SCIE | 1001-6538
SCIENCE IN CHINA SERIES E-TECHNOLOGICAL SCIENCES | 中国科学E—技术科学(英文版) | SCIENCE CHINA PRESS | SCI CD SCIE | 1006-9321
ACTA MECHANICA SOLIDA SINICA | 固体力学学报 | ACTA MECHANICA SOLIDA SINICA | SCIE | 0894-9166
JOURNAL OF CENTRAL SOUTH UNIVERSITY OF TECHNOLOGY | 中南工业大学学报(英文版) | JOURNAL OF CENTRAL SOUTH UNIV TECHNOLOGY | SCIE | 1005-9784
JOURNAL OF COMPUTATIONAL MATHEMATICS | 计算数学(英文版) | VSP BV | SCIE | 0254-9409
JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY | 计算机科学与技术学报(英文版) | SCIENCE CHINA PRESS | SCIE | 1000-9000
JOURNAL OF UNIVERSITY OF SCIENCE AND TECHNOLOGY BEIJING | 北京科技大学学报(英文版) | JOURNAL OF UNIV OF SCIENCE AND TECHNOLOGY BEIJING | SCIE | 1005-8850
JOURNAL OF WUHAN UNIVERSITY OF TECHNOLOGY-MATERIALS SCIENCE EDITION | 武汉理工大学学报(英文版) | JOURNAL WUHAN UNIV TECHNOLOGY | SCIE | 1000-2413
PROGRESS IN NATURAL SCIENCE | 自然科学进展 | TAYLOR & FRANCIS LTD | SCIE | 1002-0071
SCIENCE IN CHINA SERIES F-INFORMATION SCIENCES | 中国科学F—信息科学 | SCIENCE CHINA PRESS | SCIE | 1009-2757
Department of Electronic Engineering, first-level discipline of Electronic Science and Technology: summary of major international academic conferences
I. Class A conferences
No. | English name (abbreviation) | Chinese name | Notes
1. | Conference on Optical Fiber Communication (OFC) | 光纤通信会议
2. | IEEE Lasers and Electro-Optics Society Annual Meeting (CLEO) | 激光和电光协会年会
3. | Symposium on Information Display (SID) | 显示会议
4. | International Vacuum Conference (IVC) | 国际真空会议
5. | IEEE International Microwave Symposium (IMS) | 国际微波会议 | Every June, USA
6. | IEEE International Symposium on Antenna and Propagation (AP-S) | 国际天线和传播会议 | Every July, USA
7. | International Solid-State Circuits Conference (ISSCC) | 国际固态电路年会
8. | Asia and South Pacific Design Automation Conference (IEEE/ACM ASP-DAC) | 亚洲及南太平洋地区设计自动化年会
II. Class B conferences
No. | English name (abbreviation) | Chinese name | Notes
9. | European Conference on Optical Communication (ECOC) | 欧洲光通信会议
10. | The Optoelectronics and Communications Conference (OECC) | 光电子和通信会议
11. | International Conference on Integrated Optics and Optical Fibre Communication (IOOC) | 国际集成光学和光纤通信会议
12. | International Symposium of American Vacuum Society (IVS) | 美国真空学会国际会议
13. | Conference on Lasers and Electro-Optics (CLEO) | 激光和电光会议
14. | Conference on Optical Fiber Sensor (OFS) | 光纤传感会议
15. | Asian Symposium on Information Display (ASID) | 亚洲显示会议
16. | International Symposium of Infrared and Millimeter Wave (SPIE) | 国际远红外与毫米波会议 | Annual, mainly in the USA
17. | Asia Pacific Microwave Conference (APMC) | 亚太微波会议 | Annual, mainly in the USA
18. | IEEE International Symposium of Circuits and Systems (ISCAS) | 国际电路与系统年会
19. | European Solid-State Circuit Conference (ESSCIRC) | 欧洲固态电路年会
20. | IEEE International Conference on VLSI Design (ICVLSI) | IEEE国际超大规模集成电路设计年会
21. | Design, Automation and Test in Europe (DATE) | 欧洲设计、自动化与测试年会
22. | International Symposium on Low Power Electronics and Design (ISLPED) | 国际低功耗电子学与设计年会
23. | IEEE Annual Symposium on VLSI (ISVLSI) | IEEE超大规模电路年会
Notes:
Class A conferences: the very top international conferences in this discipline;
Class B conferences: international conferences of a high academic standard, with mature organization, held as a series at regular intervals.
http://www3.ntu.edu.sg/home/assourav/crank.htm
Computer Science Conference Rankings
DISCLAIMER:
The rankings of conferences are taken mostly from an informal external source. The detailed procedure behind the ranking is unknown to the author. These rankings do not necessarily represent my personal view either.
There is a possibility that some of the rankings may not be accurate, may not reflect current status of the conferences accurately, may not be complete, and there is no copyright. If you are looking for a scientifically accurate ranking please disregard this page. The author does not bear any responsibility if the ranking is inaccurate.
If you think this ranking offends you for some reason (e.g., you are not satisfied with the ranking of your conference), please just ignore the list. This is not an OFFICIAL list. This is ONLY FOR REFERENCE.
Some conferences accept multiple categories of papers. The rankings below are for the most prestigious category of paper at a given conference. All other categories should be treated as "unranked".
AREA: Databases
Rank 1:
SIGMOD: ACM SIGMOD Conf on Management of Data
PODS: ACM SIGMOD Conf on Principles of DB Systems
VLDB: Very Large Data Bases
ICDE: Intl Conf on Data Engineering
CIKM: Intl. Conf on Information and Knowledge Management
ICDT: Intl Conf on Database Theory
Rank 2:
SSD: Intl Symp on Large Spatial Databases
DEXA: Database and Expert System Applications
FODO: Intl Conf on Foundation on Data Organization
EDBT: Extending DB Technology
DOOD: Deductive and Object-Oriented Databases
DASFAA: Database Systems for Advanced Applications
SSDBM: Intl Conf on Scientific and Statistical DB Mgmt
CoopIS - Conference on Cooperative Information Systems
ER - Intl Conf on Conceptual Modeling (ER)
Rank 3:
COMAD: Intl Conf on Management of Data
BNCOD: British National Conference on Databases
ADC: Australasian Database Conference
ADBIS: Symposium on Advances in DB and Information Systems
DaWaK - Data Warehousing and Knowledge Discovery
RIDE Workshop
IFIP-DS: IFIP-DS Conference
IFIP-DBSEC - IFIP Workshop on Database Security
NGDB: Intl Symp on Next Generation DB Systems and Apps
ADTI: Intl Symp on Advanced DB Technologies and Integration
FEWFDB: Far East Workshop on Future DB Systems
MDM - Int. Conf. on Mobile Data Access/Management (MDA/MDM)
VDB - Visual Database Systems
IDEAS - International Database Engineering and Application Symposium
Others:
ARTDB - Active and Real-Time Database Systems
CODAS: Intl Symp on Cooperative DB Systems for Adv Apps
DBPL - Workshop on Database Programming Languages
EFIS/EFDBS - Engineering Federated Information (Database) Systems
KRDB - Knowledge Representation Meets Databases
NDB - National Database Conference (China)
NLDB - Applications of Natural Language to Data Bases
FQAS - Flexible Query-Answering Systems
IDC(W) - International Database Conference (HK CS)
RTDB - Workshop on Real-Time Databases
SBBD: Brazilian Symposium on Databases
WebDB - International Workshop on the Web and Databases
WAIM: International Conference on Web Age Information Management
DASWIS - Data Semantics in Web Information Systems
DMDW - Design and Management of Data Warehouses
DOLAP - International Workshop on Data Warehousing and OLAP
DMKD - Workshop on Research Issues in Data Mining and Knowledge Discovery
KDEX - Knowledge and Data Engineering Exchange Workshop
NRDM - Workshop on Network-Related Data Management
MobiDE - Workshop on Data Engineering for Wireless and Mobile Access
MDDS - Mobility in Databases and Distributed Systems
MEWS - Mining for Enhanced Web Search
TAKMA - Theory and Applications of Knowledge MAnagement
WIDM: International Workshop on Web Information and Data Management
W2GIS - International Workshop on Web and Wireless Geographical Information Systems
CDB - Constraint Databases and Applications
DTVE - Workshop on Database Technology for Virtual Enterprises
IWDOM - International Workshop on Distributed Object Management
OODBS - Workshop on Object-Oriented Database Systems
PDIS: Parallel and Distributed Information Systems
AREA: Artificial Intelligence and Related Subjects
Rank 1:
AAAI: American Association for AI National Conference
CVPR: IEEE Conf on Comp Vision and Pattern Recognition
IJCAI: Intl Joint Conf on AI
ICCV: Intl Conf on Computer Vision
ICML: Intl Conf on Machine Learning
KDD: Knowledge Discovery and Data Mining
KR: Intl Conf on Principles of KR & Reasoning
NIPS: Neural Information Processing Systems
UAI: Conference on Uncertainty in AI
AAMAS: Intl Conf on Autonomous Agents and Multi-Agent Systems (past: ICAA)
ACL: Annual Meeting of the ACL (Association of Computational Linguistics)
Rank 2:
NAACL: North American Chapter of the ACL
AID: Intl Conf on AI in Design
AI-ED: World Conference on AI in Education
CAIP: Intl Conf on Comp. Analysis of Images and Patterns
CSSAC: Cognitive Science Society Annual Conference
ECCV: European Conference on Computer Vision
EAI: European Conf on AI
EML: European Conf on Machine Learning
GECCO: Genetic and Evolutionary Computation Conference (used to be GP)
IAAI: Innovative Applications in AI
ICIP: Intl Conf on Image Processing
ICNN/IJCNN: Intl (Joint) Conference on Neural Networks
ICPR: Intl Conf on Pattern Recognition
ICDAR: International Conference on Document Analysis and Recognition
ICTAI: IEEE conference on Tools with AI
AMAI: Artificial Intelligence and Maths
DAS: International Workshop on Document Analysis Systems
WACV: IEEE Workshop on Apps of Computer Vision
COLING: International Conference on Computational Linguistics
EMNLP: Empirical Methods in Natural Language Processing
EACL: Annual Meeting of European Association Computational Linguistics
CoNLL : Conference on Natural Language Learning
DocEng : ACM Symposium on Document Engineering
IEEE/WIC International Joint Conf on Web Intelligence and Intelligent Agent Technology
ICDM - IEEE International Conference on Data Mining
Rank 3:
PRICAI: Pacific Rim Intl Conf on AI
AAI: Australian National Conf on AI
ACCV: Asian Conference on Computer Vision
AI*IA: Congress of the Italian Assoc for AI
ANNIE: Artificial Neural Networks in Engineering
ANZIIS: Australian/NZ Conf on Intelligent Inf. Systems
CAIA: Conf on AI for Applications
CAAI: Canadian Artificial Intelligence Conference
ASADM: Chicago ASA Data Mining Conf: A Hard Look at DM
EPIA: Portuguese Conference on Artificial Intelligence
FCKAML: French Conf on Know. Acquisition & Machine Learning
ICANN: International Conf on Artificial Neural Networks
ICCB: International Conference on Case-Based Reasoning
ICGA: International Conference on Genetic Algorithms
ICONIP: Intl Conf on Neural Information Processing
IEA/AIE: Intl Conf on Ind. & Eng. Apps of AI & Expert Sys
ICMS: International Conference on Multiagent Systems
ICPS: International conference on Planning Systems
IWANN: Intl Work-Conf on Art & Natural Neural Networks
PACES: Pacific Asian Conference on Expert Systems
SCAI: Scandinavian Conference on Artificial Intelligence
SPICIS: Singapore Intl Conf on Intelligent System
PAKDD: Pacific-Asia Conf on Know. Discovery & Data Mining
SMC: IEEE Intl Conf on Systems, Man and Cybernetics
PAKDDM: Practical App of Knowledge Discovery & Data Mining
WCNN: The World Congress on Neural Networks
WCES: World Congress on Expert Systems
ASC: Intl Conf on AI and Soft Computing
PACLIC: Pacific Asia Conference on Language, Information and Computation
ICCC: International Conference on Chinese Computing
ICADL: International Conference on Asian Digital Libraries
RANLP: Recent Advances in Natural Language Processing
NLPRS: Natural Language Pacific Rim Symposium
Meta-Heuristics International Conference
Rank 3:
ICRA: IEEE Intl Conf on Robotics and Automation
NNSP: Neural Networks for Signal Processing
ICASSP: IEEE Intl Conf on Acoustics, Speech and SP
GCCCE: Global Chinese Conference on Computers in Education
ICAI: Intl Conf on Artificial Intelligence
AEN: IASTED Intl Conf on AI, Exp Sys & Neural Networks
WMSCI: World Multiconfs on Sys, Cybernetics & Informatics
LREC: Language Resources and Evaluation Conference
AIMSA: Artificial Intelligence: Methodology, Systems, Applications
AISC: Artificial Intelligence and Symbolic Computation
CIA: Cooperative Information Agents
International Conference on Computational Intelligence for Modelling, Control and Automation
Pattern Matching
ECAL: European Conference on Artificial Life
EKAW: Knowledge Acquisition, Modeling and Management
EMMCVPR: Energy Minimization Methods in Computer Vision and Pattern Recognition
EuroGP : European Conference on Genetic Programming
FoIKS : Foundations of Information and Knowledge Systems
IAWTIC: International Conference on Intelligent Agents, Web Technologies and Internet Commerce
ICAIL: International Conference on Artificial Intelligence and Law
SMIS: International Symposium on Methodologies for Intelligent Systems
IS&N: Intelligence and Services in Networks
JELIA: Logics in Artificial Intelligence
KI: German Conference on Artificial Intelligence
KRDB: Knowledge Representation Meets Databases
MAAMAW: Modelling Autonomous Agents in a Multi-Agent World
NC: ICSC Symposium on Neural Computation
PKDD: Principles of Data Mining and Knowledge Discovery
SBIA: Brazilian Symposium on Artificial Intelligence
Scale-Space: Scale-Space Theories in Computer Vision
XPS: Knowledge-Based Systems
I2CS: Innovative Internet Computing Systems
TARK: Theoretical Aspects of Rationality and Knowledge Meeting
MKM: International Workshop on Mathematical Knowledge Management
ACIVS: International Conference on Advanced Concepts For Intelligent Vision Systems
ATAL: Agent Theories, Architectures, and Languages
LACL: International Conference on Logical Aspects of Computational Linguistics
AREA: Hardware and Architecture
Rank 1:
ASPLOS: Architectural Support for Prog Lang and OS
ISCA: ACM/IEEE Symp on Computer Architecture
ICCAD: Intl Conf on Computer-Aided Design
DAC: Design Automation Conf
MICRO: Intl Symp on Microarchitecture
HPCA: IEEE Symp on High-Perf Comp Architecture
Rank 2:
FCCM: IEEE Symposium on Field Programmable Custom Computing Machines
SUPER: ACM/IEEE Supercomputing Conference
ICS: Intl Conf on Supercomputing
ISSCC: IEEE Intl Solid-State Circuits Conf
HCS: Hot Chips Symp
VLSI: IEEE Symp VLSI Circuits
CODES+ISSS: Intl Conf on Hardware/Software Codesign & System Synthesis
DATE: IEEE/ACM Design, Automation & Test in Europe Conference
FPL: Field-Programmable Logic and Applications
CASES: International Conference on Compilers, Architecture, and Synthesis for Embedded Systems
Rank 3:
ICA3PP: Algs and Archs for Parall Proc
EuroMICRO : New Frontiers of Information Technology
ACS: Australian Supercomputing Conf
ISC: Information Security Conference
Unranked:
Advanced Research in VLSI
International Symposium on System Synthesis
International Symposium on Computer Design
International Symposium on Circuits and Systems
Asia Pacific Design Automation Conference
International Symposium on Physical Design
International Conference on VLSI Design
CANPC: Communication, Architecture, and Applications for Network-Based Parallel Computing
CHARME: Conference on Correct Hardware Design and Verification Methods
CHES: Cryptographic Hardware and Embedded Systems
NDSS: Network and Distributed System Security Symposium
NOSA: Nordic Symposium on Software Architecture
ACAC: Australasian Computer Architecture Conference
CSCC: WSES/IEEE world multiconference on Circuits, Systems, Communications & Computers
ICN: IEEE International Conference on Networking
Topology in Computer Science Conference
AREA: Applications and Media
Rank 1:
I3DG: ACM-SIGGRAPH Interactive 3D Graphics
SIGGRAPH: ACM SIGGRAPH Conference
ACM-MM: ACM Multimedia Conference
DCC: Data Compression Conf
SIGMETRICS: ACM Conf on Meas. & Modelling of Comp Sys
SIGIR: ACM SIGIR Conf on Information Retrieval
PECCS: IFIP Intl Conf on Perf Eval of Comp & Comm Sys
WWW: World-Wide Web Conference
Rank 2:
IEEE Visualization
EUROGRAPH: European Graphics Conference
CGI: Computer Graphics International
CANIM: Computer Animation
PG: Pacific Graphics
ICME: Intl Conf on MMedia & Expo
NOSSDAV: Network and OS Support for Digital A/V
PADS: ACM/IEEE/SCS Workshop on Parallel & Dist Simulation
WSC: Winter Simulation Conference
ASS: IEEE Annual Simulation Symposium
MASCOTS: Symp Model Analysis & Sim of Comp & Telecom Sys
PT: Perf Tools - Intl Conf on Model Tech & Tools for CPE
NetStore : Network Storage Symposium
MMCN: ACM/SPIE Multimedia Computing and Networking
JCDL: Joint Conference on Digital Libraries
Rank 3:
ACM-HPC: ACM Hypertext Conf
MMM: Multimedia Modelling
DSS: Distributed Simulation Symposium
SCSC: Summer Computer Simulation Conference
WCSS: World Congress on Systems Simulation
ESS: European Simulation Symposium
ESM: European Simulation Multiconference
HPCN: High-Performance Computing and Networking
Geometry Modeling and Processing
WISE
DS-RT: Distributed Simulation and Real-time Applications
IEEE Intl Wshop on Dist Int Simul and Real-Time Applications
ECIR: European Colloquium on Information Retrieval
Ed-Media
IMSA: Intl Conf on Internet and MMedia Sys
Un-ranked:
DVAT: IS&T/SPIE Conf on Dig Video Compression Alg & Tech
MME: IEEE Intl Conf. on Multimedia in Education
ICMSO: Intl Conf on Modelling, Simulation and Optimisation
ICMS: IASTED Intl Conf on Modelling and Simulation
COTIM: Conference on Telecommunications and Information Markets
DOA: International Symposium on Distributed Objects and Applications
ECMAST: European Conference on Multimedia Applications, Services and Techniques
GIS: Workshop on Advances in Geographic Information Systems
IDA: Intelligent Data Analysis
IDMS: Interactive Distributed Multimedia Systems and Telecommunication Services
IUI: Intelligent User Interfaces
MIS: Workshop on Multimedia Information Systems
WECWIS: Workshop on Advanced Issues of E-Commerce and Web/based Information Systems
WIDM: Web Information and Data Management
WOWMOM: Workshop on Wireless Mobile Multimedia
WSCG: International Conference in Central Europe on Computer Graphics and Visualization
LDTA: Workshop on Language Descriptions, Tools and Applications
IPDPSWPIM: International Workshop on Parallel and Distributed Computing Issues in Wireless Networks and Mobile Computing
IWST: International Workshop on Scheduling and Telecommunications
APDCM: Workshop on Advances in Parallel and Distributed Computational Models
CIMA: International ICSC Congress on Computational Intelligence: Methods and Applications
FLA: Fuzzy Logic and Applications Meeting
ICACSD: International Conference on Application of Concurrency to System Design
ICATPN: International conference on application and theory of Petri nets
AICCSA: ACS International Conference on Computer Systems and Applications
CAGD: International Symposium of Computer Aided Geometric Design
Spanish Symposium on Pattern Recognition and Image Analysis
International Workshop on Cluster Infrastructure for Web Server and E-Commerce Applications
WSES ISA: Information Science And Applications Conference
CHT: International Symposium on Advances in Computational Heat Transfer
IMACS: International Conference on Applications of Computer Algebra
VIPromCom : International Symposium on Video Processing and Multimedia Communications
PDMPR: International Workshop on Parallel and Distributed Multimedia Processing & Retrieval
International Symposium On Computational And Applied PDEs
PDCAT: International Conference on Parallel and Distributed Computing, Applications, and Techniques
Biennial Computational Techniques and Applications Conference
Symposium on Advanced Computing in Financial Markets
WCCE: World Conference on Computers in Education
ITCOM: SPIE's International Symposium on The Convergence of Information Technologies and Communications
Conference on Commercial Applications for High-Performance Computing
MSA: Metacomputing Systems and Applications Workshop
WPMC : International Symposium on Wireless Personal Multimedia Communications
WSC: Online World Conference on Soft Computing in Industrial Applications
HERCMA: Hellenic European Research on Computer Mathematics and its Applications
PARA: Workshop on Applied Parallel Computing
International Computer Science Conference: Active Media Technology
IW-MMDBMS - Int. Workshop on Multi-Media Data Base Management Systems
AREA: System Technology
Rank 1:
SIGCOMM: ACM Conf on Comm Architectures, Protocols & Apps
INFOCOM: Annual Joint Conf IEEE Comp & Comm Soc
SPAA: Symp on Parallel Algms and Architecture
PODC: ACM Symp on Principles of Distributed Computing
PPoPP : Principles and Practice of Parallel Programming
RTSS: Real Time Systems Symp
SOSP: ACM SIGOPS Symp on OS Principles
SOSDI: Usenix Symp on OS Design and Implementation
CCS: ACM Conf on Comp and Communications Security
IEEE Symposium on Security and Privacy
MOBICOM: ACM Intl Conf on Mobile Computing and Networking
USENIX Conf on Internet Tech and Sys
ICNP: Intl Conf on Network Protocols
PACT: Intl Conf on Parallel Arch and Compil Tech
RTAS: IEEE Real-Time and Embedded Technology and Applications Symposium
ICDCS: IEEE Intl Conf on Distributed Comp Systems
Rank 2:
CC: Compiler Construction
IPDPS: Intl Parallel and Dist Processing Symp
IC3N: Intl Conf on Comp Comm and Networks
ICPP: Intl Conf on Parallel Processing
SRDS: Symp on Reliable Distributed Systems
MPPOI: Massively Par Proc Using Opt Interconns
ASAP: Intl Conf on Apps for Specific Array Processors
Euro-Par: European Conf. on Parallel Computing
Fast Software Encryption
Usenix Security Symposium
European Symposium on Research in Computer Security
WCW: Web Caching Workshop
LCN: IEEE Annual Conference on Local Computer Networks
IPCCC: IEEE Intl Phoenix Conf on Comp & Communications
CCC: Cluster Computing Conference
ICC: Intl Conf on Comm
WCNC: IEEE Wireless Communications and Networking Conference
CSFW: IEEE Computer Security Foundations Workshop
Rank 3:
MPCS: Intl. Conf. on Massively Parallel Computing Systems
GLOBECOM: Global Comm
ICCC: Intl Conf on Comp Communication
NOMS: IEEE Network Operations and Management Symp
CONPAR: Intl Conf on Vector and Parallel Processing
VAPP: Vector and Parallel Processing
ICPADS: Intl Conf. on Parallel and Distributed Systems
Public Key Cryptosystems
Annual Workshop on Selected Areas in Cryptography
Australasia Conference on Information Security and Privacy
Int. Conf on Inform and Comm. Security
Financial Cryptography
Workshop on Information Hiding
Smart Card Research and Advanced Application Conference
ICON: Intl Conf on Networks
NCC: Nat Conf Comm
IN: IEEE Intell Network Workshop
Softcomm : Conf on Software in Tcomms and Comp Networks
INET: Internet Society Conf
Workshop on Security and Privacy in E-commerce
Un-ranked:
PARCO: Parallel Computing
SE: Intl Conf on Systems Engineering
PDSECA: workshop on Parallel and Distributed Scientific and Engineering Computing with Applications
CACS: Computer Audit, Control and Security Conference
SREIS: Symposium on Requirements Engineering for Information Security
SAFECOMP: International Conference on Computer Safety, Reliability and Security
IREJVM: Workshop on Intermediate Representation Engineering for the Java Virtual Machine
EC: ACM Conference on Electronic Commerce
EWSPT: European Workshop on Software Process Technology
HotOS : Workshop on Hot Topics in Operating Systems
HPTS: High Performance Transaction Systems
Hybrid Systems
ICEIS: International Conference on Enterprise Information Systems
IOPADS: I/O in Parallel and Distributed Systems
IRREGULAR: Workshop on Parallel Algorithms for Irregularly Structured Problems
KiVS : Kommunikation in Verteilten Systemen
LCR: Languages, Compilers, and Run-Time Systems for Scalable Computers
MCS: Multiple Classifier Systems
MSS: Symposium on Mass Storage Systems
NGITS: Next Generation Information Technologies and Systems
OOIS: Object Oriented Information Systems
SCM: System Configuration Management
Security Protocols Workshop
SIGOPS European Workshop
SPDP: Symposium on Parallel and Distributed Processing
TreDS : Trends in Distributed Systems
USENIX Technical Conference
VISUAL: Visual Information and Information Systems
FoDS : Foundations of Distributed Systems: Design and Verification of Protocols conference
RV: Post-CAV Workshop on Runtime Verification
ICAIS: International ICSC-NAISO Congress on Autonomous Intelligent Systems
ITiCSE : Conference on Integrating Technology into Computer Science Education
CSCS: CyberSystems and Computer Science Conference
AUIC: Australasian User Interface Conference
ITI: Meeting of Researchers in Computer Science, Information Systems Research & Statistics
European Conference on Parallel Processing
RODLICS: Wses International Conference on Robotics, Distance Learning & Intelligent Communication Systems
International Conference On Multimedia, Internet & Video Technologies
PaCT : Parallel Computing Technologies workshop
PPAM: International Conference on Parallel Processing and Applied Mathematics
International Conference On Information Networks, Systems And Technologies
AmiRE : Conference on Autonomous Minirobots for Research and Edutainment
DSN: The International Conference on Dependable Systems and Networks
IHW: Information Hiding Workshop
GTVMT: International Workshop on Graph Transformation and Visual Modeling Techniques
AREA: Programming Languages and Software Engineering
Rank 1:
POPL: ACM-SIGACT Symp on Principles of Prog Langs
PLDI: ACM-SIGPLAN Symp on Prog Lang Design & Impl
OOPSLA: OO Prog Systems, Langs and Applications
ICFP: Intl Conf on Functional Programming
JICSLP/ICLP/ILPS: (Joint) Intl Conf/Symp on Logic Prog
ICSE: Intl Conf on Software Engineering
FSE: ACM Conf on the Foundations of Software Engineering (inc: ESEC-FSE)
FM/FME: Formal Methods, World Congress/Europe
CAV: Computer Aided Verification
Rank 2:
CP: Intl Conf on Principles & Practice of Constraint Prog
TACAS: Tools and Algos for the Const and An of Systems
ESOP: European Conf on Programming
ICCL: IEEE Intl Conf on Computer Languages
PEPM: Symp on Partial Evaluation and Prog Manipulation
SAS: Static Analysis Symposium
RTA: Rewriting Techniques and Applications
IWSSD: Intl Workshop on S/W Spec & Design
CAiSE : Intl Conf on Advanced Info System Engineering
SSR: ACM SIGSOFT Working Conf on Software Reusability
SEKE: Intl Conf on S/E and Knowledge Engineering
ICSR: IEEE Intl Conf on Software Reuse
ASE: Automated Software Engineering Conference
PADL: Practical Aspects of Declarative Languages
ISRE: Requirements Engineering
ICECCS: IEEE Intl Conf on Eng. of Complex Computer Systems
IEEE Intl Conf on Formal Engineering Methods
Intl Conf on Integrated Formal Methods
FOSSACS: Foundations of Software Science and Comp Struct
APLAS: Asian Symposium on Programming Languages and Systems
MPC: Mathematics of Program Construction
ECOOP: European Conference on Object-Oriented Programming
ICSM: Intl. Conf on Software Maintenance
HASKELL - Haskell Workshop
Rank 3:
FASE: Fund Appr to Soft Eng
APSEC: Asia-Pacific S/E Conf
PAP/PACT: Practical Aspects of PROLOG/Constraint Tech
ALP: Intl Conf on Algebraic and Logic Programming
PLILP: Prog, Lang Implementation & Logic Programming
LOPSTR: Intl Workshop on Logic Prog Synthesis & Transf
ICCC: Intl Conf on Compiler Construction
COMPSAC: Intl. Computer S/W and Applications Conf
TAPSOFT: Intl Joint Conf on Theory & Pract of S/W Dev
WCRE: SIGSOFT Working Conf on Reverse Engineering
AQSDT: Symp on Assessment of Quality S/W Dev Tools
IFIP Intl Conf on Open Distributed Processing
Intl Conf of Z Users
IFIP Joint Int'l Conference on Formal Description Techniques and Protocol Specification, Testing, And Verification
PSI (Ershov conference)
UML: International Conference on the Unified Modeling Language
Un-ranked:
Australian Software Engineering Conference
IEEE Int. W'shop on Object-oriented Real-time Dependable Sys. (WORDS)
IEEE International Symposium on High Assurance Systems Engineering
The Northern Formal Methods Workshops
Formal Methods Pacific
Int. Workshop on Formal Methods for Industrial Critical Systems
JFPLC - International French Speaking Conference on Logic and Constraint Programming
L&L - Workshop on Logic and Learning
SFP - Scottish Functional Programming Workshop
LCCS - International Workshop on Logic and Complexity in Computer Science
VLFM - Visual Languages and Formal Methods
NASA LaRC Formal Methods Workshop
PASTE: Workshop on Program Analysis For Software Tools and Engineering
TLCA: Typed Lambda Calculus and Applications
FATES - A Satellite workshop on Formal Approaches to Testing of Software
Workshop On Java For High-Performance Computing
DSLSE - Domain-Specific Languages for Software Engineering
FTJP - Workshop on Formal Techniques for Java Programs
WFLP - International Workshop on Functional and (Constraint) Logic Programming
FOOL - International Workshop on Foundations of Object-Oriented Languages
SREIS - Symposium on Requirements Engineering for Information Security
HLPP - International workshop on High-level parallel programming and applications
INAP - International Conference on Applications of Prolog
MPOOL - Workshop on Multiparadigm Programming with OO Languages
PADO - Symposium on Programs as Data Objects
TOOLS: Int'l Conf Technology of Object-Oriented Languages and Systems
Australasian Conference on Parallel And Real-Time Systems
AvoCS: Workshop on Automated Verification of Critical Systems
SPIN: Workshop on Model Checking of Software
FemSys: Workshop on Formal Design of Safety Critical Embedded Systems
Ada-Europe
PPDP: Principles and Practice of Declarative Programming
APL Conference
ASM: Workshops on Abstract State Machines
COORDINATION: Coordination Models and Languages
DocEng: ACM Symposium on Document Engineering
DSV-IS: Design, Specification, and Verification of Interactive Systems
FMCAD: Formal Methods in Computer-Aided Design
FMLDO: Workshop on Foundations of Models and Languages for Data and Objects
IFL: Implementation of Functional Languages
ILP: International Workshop on Inductive Logic Programming
ISSTA: International Symposium on Software Testing and Analysis
ITC: International Test Conference
IWFM: Irish Workshop in Formal Methods
Java Grande
LP: Logic Programming: Japanese Conference
LPAR: Logic Programming and Automated Reasoning
LPE: Workshop on Logic Programming Environments
LPNMR: Logic Programming and Non-monotonic Reasoning
PJW: Workshop on Persistence and Java
RCLP: Russian Conference on Logic Programming
STEP: Software Technology and Engineering Practice
TestCom: IFIP International Conference on Testing of Communicating Systems
VL: Visual Languages
FMPPTA: Workshop on Formal Methods for Parallel Programming Theory and Applications
WRS: International Workshop on Reduction Strategies in Rewriting and Programming
FORMALWARE: Meeting on Formalware Engineering: Formal Methods for Engineering Software
DRE: Conference on Data Reverse Engineering
STAREAST: Software Testing Analysis & Review Conference
Conference on Applied Mathematics and Scientific Computing
International Testing Computer Software Conference
Linux Showcase & Conference
FLOPS: International Symposium on Functional and Logic Programming
GCSE: International Conference on Generative and Component-Based Software Engineering
JOSES: Java Optimization Strategies for Embedded Systems
AADEBUG: Automated and Algorithmic Debugging
AMAST: Algebraic Methodology and Software Technology
AREA: Algorithms and Theory
Rank 1:
STOC: ACM Symp on Theory of Computing
FOCS: IEEE Symp on Foundations of Computer Science
COLT: Computational Learning Theory
LICS: IEEE Symp on Logic in Computer Science
SCG: ACM Symp on Computational Geometry
SODA: ACM/SIAM Symp on Discrete Algorithms
SPAA: ACM Symp on Parallel Algorithms and Architectures
ISSAC: Intl. Symp on Symbolic and Algebraic Computation
CRYPTO: Advances in Cryptology
Rank 2:
EUROCRYPT: European Conf on Cryptography
CONCUR: International Conference on Concurrency Theory
ICALP: Intl Colloquium on Automata, Languages and Prog
STACS: Symp on Theoretical Aspects of Computer Science
CC: IEEE Symp on Computational Complexity
WADS: Workshop on Algorithms and Data Structures
MFCS: Mathematical Foundations of Computer Science
SWAT: Scandinavian Workshop on Algorithm Theory
ESA: European Symp on Algorithms
IPCO: MPS Conf on integer programming & comb optimization
LFCS: Logical Foundations of Computer Science
ALT: Algorithmic Learning Theory
EUROCOLT: European Conf on Learning Theory
DISC: Int'l Symp on Distributed Computing (formerly WDAG: Workshop on Distributed Algorithms)
ISTCS: Israel Symp on Theory of Computing and Systems
ISAAC: Intl Symp on Algorithms and Computation
FST&TCS: Foundations of S/W Tech & Theoretical CS
LATIN: Intl Symp on Latin American Theoretical Informatics
CADE: Conf on Automated Deduction
IEEEIT: IEEE Symposium on Information Theory
Asiacrypt
Rank 3:
MEGA: Méthodes Effectives en Géométrie Algébrique
ASIAN: Asian Computing Science Conf
CCCG: Canadian Conf on Computational Geometry
FCT: Fundamentals of Computation Theory
WG: Workshop on Graph Theory
CIAC: Italian Conf on Algorithms and Complexity
ICCI: Advances in Computing and Information
AWTI: Argentine Workshop on Theoretical Informatics
CATS: The Australian Theory Symp
COCOON: Annual Intl Computing and Combinatorics Conf
UMC: Unconventional Models of Computation
MCU: Universal Machines and Computations
GD: Graph Drawing
SIROCCO: Structural Info & Communication Complexity
ALEX: Algorithms and Experiments
ALG: ENGG Workshop on Algorithm Engineering
LPMA: Intl Workshop on Logic Programming and Multi-Agents
EWLR: European Workshop on Learning Robots
CITB: Complexity & info-theoretic approaches to biology
FTP: Intl Workshop on First-Order Theorem Proving (FTP)
CSL: Annual Conf on Computer Science Logic (CSL)
AAAAECC: Conf On Applied Algebra, Algebraic Algms & ECC
DMTCS: Intl Conf on Disc Math and TCS
JCDCG: Japan Conference on Discrete and Computational Geometry
Un-ranked:
Information Theory Workshop
CL: International Conference on Computational Logic
COSIT: Spatial Information Theory
ETAPS: European joint conference on Theory And Practice of Software
ICCS: International Conference on Conceptual Structures
ICISC: Information Security and Cryptology
PPSN: Parallel Problem Solving from Nature
SOFSEM: Conference on Current Trends in Theory and Practice of Informatics
TPHOLs: Theorem Proving in Higher Order Logics
WADT: Workshop on Algebraic Development Techniques
TERM: THEMATIC TERM: Semigroups, Algorithms, Automata and Languages
IMGTA: Italian Meeting on Game Theory and Applications
DLT: Developments in Language Theory
International Workshop on Discrete Algorithms and Methods for Mobile Computing and Communications
APPROX: International Workshop on Approximation Algorithms for Combinatorial Optimization Problems
WAE: Workshop on Algorithm Engineering
CMFT: Computational Methods and Function Theory
AWOCA: Australasian Workshop on Combinatorial Algorithms
Fun with Algorithms Meeting
ICTCS: Italian Conference on Theoretical Computer Science
ComMaC: International Conference On Computational Mathematics
TLCA: Typed Lambda Calculus and Applications
DCAGRS: Workshop on Descriptional Complexity of Automata, Grammars and Related Structures
AREA: Biomedical
Rank 1:
RECOMB: Annual Intl Conf on Comp Molecular Biology
ISMB: International Conference on Intelligent Systems for Molecular Biology
Rank 2:
AMIA: American Medical Informatics Annual Fall Symposium
DNA: Meeting on DNA Based Computers
WABI: Workshop on Algorithms in Bioinformatics
Rank 3:
MEDINFO: World Congress on Medical Informatics
International Conference on Sequences and their Applications
ECAIM: European Conf on AI in Medicine
APAMI: Asia Pacific Assoc for Medical Informatics Conf
INBS: IEEE Intl Symp on Intell. in Neural & Bio Systems
Un-ranked:
MCBC: WSES Conf on Mathematics And Computers In Biology And Chemistry
KDDMBD - Knowledge Discovery and Data Mining in Biological Databases Meeting
AREA: Miscellaneous
Rank 1:
Rank 2:
CSCW: Conference on Computer Supported Cooperative Work (*)
Rank 3:
SAC: ACM/SIGAPP Symposium on Applied Computing
ICSC: International Computer Science Conference
ISCIS: Intl Symp on Computer and Information Sciences
ICSC2: International Computer Symposium Conference
ICCE: Intl Conf on Comps in Edu
WCC: World Computing Congress
PATAT: Practice and Theory of Automated Timetabling
Un-ranked:
ICCI: International Conference on Cognitive Informatics
APISIT: Asia Pacific International Symposium on Information Technology
CW: The International Conference on Cyberworlds
Workshop on Open Hypermedia Systems
Workshop on Middleware for Mobile Computing
International Working Conference on Distributed Applications and Interoperable Systems
ADL: Advances in Digital Libraries
ADT: Specification of Abstract Data Type Workshops
AVI: Working Conference on Advanced Visual Interfaces
DL: Digital Libraries
DLog: Description Logics
ECDL: European Conference on Digital Libraries
EDCC: European Dependable Computing Conference
FroCos: Frontiers of Combining Systems
FTCS: Symposium on Fault-Tolerant Computing
IFIP World Computer Congress
INTEROP: Interoperating Geographic Information Systems
IO: Information Outlook
IQ: MIT Conference on Information Quality
IUC: International Unicode Conference
IWMM: International Workshop on Memory Management
MD: IEEE Meta-Data Conference
Middleware
MLDM: Machine Learning and Data Mining in Pattern Recognition
POS: Workshop on Persistent Object Systems
SCCC: International Conference of the Chilean Computer Science Society
SPIRE: String Processing and Information Retrieval
TABLEAUX: Analytic Tableaux and Related Methods
TIME Workshops
TREC: Text REtrieval Conference
UIDIS: User Interfaces to Data Intensive Systems
VRML Conference
AFIPS: American Federation of Information Processing Societies
ACSC: Australasian Computer Science Conference
CMCS: Coalgebraic Methods in Computer Science
BCTCS: British Colloquium for Theoretical Computer Science
IJCAR: The International Joint Conference on Automated Reasoning
STRATEGIES: International Workshop on Strategies in Automated Deduction
UNIF: International Workshop on Unification
SOCO: Meeting on Soft Computing
ConCoord: International Workshop on Concurrency and Coordination
CIAA: International Conference on Implementation and Application of Automata
Workshop on Information Structure, Discourse Structure and Discourse Semantics
RANDOM: International Workshop on Randomization and Approximation Techniques in Computer Science
WMC: Workshop on Membrane Computing
FI-CS: Fixed Points in Computer Science
DC Computer Science Conference
Workshop on Novel Approaches to Hard Discrete Optimization
NALAC: Numerical Analysis, Linear Algebra And Computations Conference
ICLSSC: International Conference on Large-Scale Scientific Computations
ISACA: Information Systems Audit and Control Association International Conference
ICOSAHOM: International Conference On Spectral And High Order Methods
AIP: International Conference on Applied Inverse Problems: Theoretical and Computational Aspects
ECCM: European Conference On Computational Mechanics
Scicade: Scientific Computing and Differential Equations
BMVC: British Machine Vision Conference
COMEP: Euroconference On Computational Mechanics And Engineering Practice
JCIS: Joint Conference on Information Sciences
CHP: Compilers for High Performance conference
SIAM Conference on Geometric Design and Computing
List of Chinese Journals Indexed by SCI in 2007
English title | Chinese title | Publisher | Index | ISSN
CHINESE JOURNAL OF ELECTRONICS | 电子学报(英文版) | TECHNOLOGY EXCHANGE LIMITED HONG KONG | SCIE | 1022-4653
CHINESE SCIENCE BULLETIN | 科学通报 | SCIENCE CHINA PRESS | SCI CD, SCIE | 1001-6538
SCIENCE IN CHINA SERIES E-TECHNOLOGICAL SCIENCES | 中国科学E—技术科学(英文版) | SCIENCE CHINA PRESS | SCI CD, SCIE | 1006-9321
ACTA MECHANICA SOLIDA SINICA | 固体力学学报 | ACTA MECHANICA SOLIDA SINICA | SCIE | 0894-9166
JOURNAL OF CENTRAL SOUTH UNIVERSITY OF TECHNOLOGY | 中南工业大学学报(英文版) | JOURNAL OF CENTRAL SOUTH UNIV TECHNOLOGY | SCIE | 1005-9784
JOURNAL OF COMPUTATIONAL MATHEMATICS | 计算数学(英文版) | VSP BV | SCIE | 0254-9409
JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY | 计算机科学与技术学报(英文版) | SCIENCE CHINA PRESS | SCIE | 1000-9000
JOURNAL OF UNIVERSITY OF SCIENCE AND TECHNOLOGY BEIJING | 北京科技大学学报(英文版) | JOURNAL OF UNIV OF SCIENCE AND TECHNOLOGY BEIJING | SCIE | 1005-8850
JOURNAL OF WUHAN UNIVERSITY OF TECHNOLOGY-MATERIALS SCIENCE EDITION | 武汉理工大学学报(英文版) | JOURNAL WUHAN UNIV TECHNOLOGY | SCIE | 1000-2413
PROGRESS IN NATURAL SCIENCE | 自然科学进展 | TAYLOR & FRANCIS LTD | SCIE | 1002-0071
SCIENCE IN CHINA SERIES F-INFORMATION SCIENCES | 中国科学F—信息科学 | SCIENCE CHINA PRESS | SCIE | 1009-2757
Department of Electronic Engineering: Summary of Important International Academic Conferences for the First-Level Discipline of Electronic Science and Technology
I. Class A Conferences
No. English name (abbreviation) | Chinese name | Remarks
1. Conference on Optical Fiber Communication (OFC) | 光纤通信会议
2. IEEE Lasers and Electro-Optics Society Annual Meeting (CLEO) | 激光和电光协会年会
3. Symposium on Information Display (SID) | 显示会议
4. International Vacuum Conference (IVC) | 国际真空会议
5. IEEE International Microwave Symposium (IMS) | 国际微波会议 | Every June, USA
6. IEEE International Symposium on Antenna and Propagation (AP-S) | 国际天线和传播会议 | Every July, USA
7. International Solid-State Circuits Conference (ISSCC) | 国际固态电路年会
8. Asia and South Pacific Design Automation Conference (IEEE/ACM ASP-DAC) | 亚洲及南太洋地区设计自动化年会
II. Class B Conferences
No. English name (abbreviation) | Chinese name | Remarks
9. European Conference on Optical Communication (ECOC) | 欧洲光通信会议
10. The Optoelectronics and Communications Conference (OECC) | 光电子和通信会议
11. International Conference on Integrated Optics and Optical Fibre Communication (IOOC) | 国际集成光学和光纤通信会议
12. International Symposium of America Vacuum Society (IVS) | 美国真空学会国际会议
13. Conference on Lasers and Electro-Optics (CLEO) | 激光和电光会议
14. Conference on Optical Fiber Sensor (OFS) | 光纤传感会议
15. Asian Symposium on Information Display (ASID) | 亚洲显示会议
16. International Symposium of Infrared and Millimeter Wave (SPIE) | 国际远红外与毫米波会议 | Annual, mainly in the USA
17. Asia Pacific Microwave Conference (APMC) | 亚太微波会议 | Annual, mainly in the USA
18. IEEE International Symposium of Circuits and Systems (ISCAS) | 国际电路与系统年会
19. European Solid-State Circuit Conference (ESSCIRC) | 欧洲固态电路年会
20. IEEE International Conference on VLSI Design (ICVLSI) | IEEE国际超大规模集成电路设计年会
21. Design, Automation and Test in Europe (DATE) | 欧洲设计、自动化与测试年会
22. International Symposium on Low Power Electronics and Design (ISLPED) | 国际低功耗电子学与设计年会
23. IEEE Annual Symposium on VLSI (ISVLSI) | IEEE超大规模电路年会
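Notes:
Class A conferences: the top-level international conferences in this discipline;
Class B conferences: international conferences of relatively high academic standing, with mature organization, held as a series at regular intervals.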
http://www3.ntu.edu.sg/home/assourav/crank.htm
Computer Science Conference Rankings
DISCLAIMER:
The rankings of conferences are taken mostly from an informal external source. The detailed procedure behind the ranking is unknown to the author. These rankings do not necessarily represent my personal view either.
Some of the rankings may be inaccurate, may not reflect the current status of the conferences, or may be incomplete, and there is no copyright. If you are looking for a scientifically accurate ranking, please disregard this page. The author does not bear any responsibility if the ranking is inaccurate.
If you think this ranking offends you for some reason (e.g., you are not satisfied with the ranking of your conference), please just ignore the list. This is not an OFFICIAL list. This is ONLY FOR REFERENCE.
Some conferences accept multiple categories of papers. The rankings below are for the most prestigious category of paper at a given conference. All other categories should be treated as "unranked".
AREA: Databases
Rank 1:
SIGMOD: ACM SIGMOD Conf on Management of Data
PODS: ACM SIGMOD Conf on Principles of DB Systems
VLDB: Very Large Data Bases
ICDE: Intl Conf on Data Engineering
CIKM: Intl. Conf on Information and Knowledge Management
ICDT: Intl Conf on Database Theory
Rank 2:
SSD: Intl Symp on Large Spatial Databases
DEXA: Database and Expert System Applications
FODO: Intl Conf on Foundation on Data Organization
EDBT: Extending DB Technology
DOOD: Deductive and Object-Oriented Databases
DASFAA: Database Systems for Advanced Applications
SSDBM: Intl Conf on Scientific and Statistical DB Mgmt
CoopIS - Conference on Cooperative Information Systems
ER - Intl Conf on Conceptual Modeling (ER)
Rank 3:
COMAD: Intl Conf on Management of Data
BNCOD: British National Conference on Databases
ADC: Australasian Database Conference
ADBIS: Symposium on Advances in DB and Information Systems
DaWaK - Data Warehousing and Knowledge Discovery
RIDE Workshop
IFIP-DS: IFIP-DS Conference
IFIP-DBSEC - IFIP Workshop on Database Security
NGDB: Intl Symp on Next Generation DB Systems and Apps
ADTI: Intl Symp on Advanced DB Technologies and Integration
FEWFDB: Far East Workshop on Future DB Systems
MDM - Int. Conf. on Mobile Data Access/Management (MDA/MDM)
VDB - Visual Database Systems
IDEAS - International Database Engineering and Application Symposium
Others:
ARTDB - Active and Real-Time Database Systems
CODAS: Intl Symp on Cooperative DB Systems for Adv Apps
DBPL - Workshop on Database Programming Languages
EFIS/EFDBS - Engineering Federated Information (Database) Systems
KRDB - Knowledge Representation Meets Databases
NDB - National Database Conference (China)
NLDB - Applications of Natural Language to Data Bases
FQAS - Flexible Query-Answering Systems
IDC(W) - International Database Conference (HK CS)
RTDB - Workshop on Real-Time Databases
SBBD: Brazilian Symposium on Databases
WebDB - International Workshop on the Web and Databases
WAIM: International Conference on Web Age Information Management
DASWIS - Data Semantics in Web Information Systems
DMDW - Design and Management of Data Warehouses
DOLAP - International Workshop on Data Warehousing and OLAP
DMKD - Workshop on Research Issues in Data Mining and Knowledge Discovery
KDEX - Knowledge and Data Engineering Exchange Workshop
NRDM - Workshop on Network-Related Data Management
MobiDE - Workshop on Data Engineering for Wireless and Mobile Access
MDDS - Mobility in Databases and Distributed Systems
MEWS - Mining for Enhanced Web Search
TAKMA - Theory and Applications of Knowledge MAnagement
WIDM: International Workshop on Web Information and Data Management
W2GIS - International Workshop on Web and Wireless Geographical Information Systems
CDB - Constraint Databases and Applications
DTVE - Workshop on Database Technology for Virtual Enterprises
IWDOM - International Workshop on Distributed Object Management
OODBS - Workshop on Object-Oriented Database Systems
PDIS: Parallel and Distributed Information Systems
AREA: Artificial Intelligence and Related Subjects
Rank 1:
AAAI: American Association for AI National Conference
CVPR: IEEE Conf on Comp Vision and Pattern Recognition
IJCAI: Intl Joint Conf on AI
ICCV: Intl Conf on Computer Vision
ICML: Intl Conf on Machine Learning
KDD: Knowledge Discovery and Data Mining
KR: Intl Conf on Principles of KR & Reasoning
NIPS: Neural Information Processing Systems
UAI: Conference on Uncertainty in AI
AAMAS: Intl Conf on Autonomous Agents and Multi-Agent Systems (past: ICAA)
ACL: Annual Meeting of the ACL (Association of Computational Linguistics)
Rank 2:
NAACL: North American Chapter of the ACL
AID: Intl Conf on AI in Design
AI-ED: World Conference on AI in Education
CAIP: Intl Conf on Comp. Analysis of Images and Patterns
CSSAC: Cognitive Science Society Annual Conference
ECCV: European Conference on Computer Vision
ECAI: European Conf on AI
ECML: European Conf on Machine Learning
GECCO: Genetic and Evolutionary Computation Conference (used to be GP)
IAAI: Innovative Applications in AI
ICIP: Intl Conf on Image Processing
ICNN/IJCNN: Intl (Joint) Conference on Neural Networks
ICPR: Intl Conf on Pattern Recognition
ICDAR: International Conference on Document Analysis and Recognition
ICTAI: IEEE conference on Tools with AI
AMAI: Artificial Intelligence and Maths
DAS: International Workshop on Document Analysis Systems
WACV: IEEE Workshop on Apps of Computer Vision
COLING: International Conference on Computational Linguistics
EMNLP: Empirical Methods in Natural Language Processing
EACL: Annual Meeting of European Association Computational Linguistics
CoNLL: Conference on Natural Language Learning
DocEng: ACM Symposium on Document Engineering
IEEE/WIC International Joint Conf on Web Intelligence and Intelligent Agent Technology
ICDM - IEEE International Conference on Data Mining
Rank 3:
PRICAI: Pacific Rim Intl Conf on AI
AAI: Australian National Conf on AI
ACCV: Asian Conference on Computer Vision
AI*IA: Congress of the Italian Assoc for AI
ANNIE: Artificial Neural Networks in Engineering
ANZIIS: Australian/NZ Conf on Intelligent Inf. Systems
CAIA: Conf on AI for Applications
CAAI: Canadian Artificial Intelligence Conference
ASADM: Chicago ASA Data Mining Conf: A Hard Look at DM
EPIA: Portuguese Conference on Artificial Intelligence
FCKAML: French Conf on Know. Acquisition & Machine Learning
ICANN: International Conf on Artificial Neural Networks
ICCB: International Conference on Case-Based Reasoning
ICGA: International Conference on Genetic Algorithms
ICONIP: Intl Conf on Neural Information Processing
IEA/AIE: Intl Conf on Ind. & Eng. Apps of AI & Expert Sys
ICMS: International Conference on Multiagent Systems
ICPS: International conference on Planning Systems
IWANN: Intl Work-Conf on Art & Natural Neural Networks
PACES: Pacific Asian Conference on Expert Systems
SCAI: Scandinavian Conference on Artificial Intelligence
SPICIS: Singapore Intl Conf on Intelligent System
PAKDD: Pacific-Asia Conf on Know. Discovery & Data Mining
SMC: IEEE Intl Conf on Systems, Man and Cybernetics
PAKDDM: Practical App of Knowledge Discovery & Data Mining
WCNN: The World Congress on Neural Networks
WCES: World Congress on Expert Systems
ASC: Intl Conf on AI and Soft Computing
PACLIC: Pacific Asia Conference on Language, Information and Computation
ICCC: International Conference on Chinese Computing
ICADL: International Conference on Asian Digital Libraries
RANLP: Recent Advances in Natural Language Processing
NLPRS: Natural Language Pacific Rim Symposium
Meta-Heuristics International Conference
Un-ranked:
ICRA: IEEE Intl Conf on Robotics and Automation
NNSP: Neural Networks for Signal Processing
ICASSP: IEEE Intl Conf on Acoustics, Speech and SP
GCCCE: Global Chinese Conference on Computers in Education
ICAI: Intl Conf on Artificial Intelligence
AEN: IASTED Intl Conf on AI, Exp Sys & Neural Networks
WMSCI: World Multiconfs on Sys, Cybernetics & Informatics
LREC: Language Resources and Evaluation Conference
AIMSA: Artificial Intelligence: Methodology, Systems, Applications
AISC: Artificial Intelligence and Symbolic Computation
CIA: Cooperative Information Agents
International Conference on Computational Intelligence for Modelling, Control and Automation
Pattern Matching
ECAL: European Conference on Artificial Life
EKAW: Knowledge Acquisition, Modeling and Management
EMMCVPR: Energy Minimization Methods in Computer Vision and Pattern Recognition
EuroGP: European Conference on Genetic Programming
FoIKS: Foundations of Information and Knowledge Systems
IAWTIC: International Conference on Intelligent Agents, Web Technologies and Internet Commerce
ICAIL: International Conference on Artificial Intelligence and Law
SMIS: International Symposium on Methodologies for Intelligent Systems
IS&N: Intelligence and Services in Networks
JELIA: Logics in Artificial Intelligence
KI: German Conference on Artificial Intelligence
KRDB: Knowledge Representation Meets Databases
MAAMAW: Modelling Autonomous Agents in a Multi-Agent World
NC: ICSC Symposium on Neural Computation
PKDD: Principles of Data Mining and Knowledge Discovery
SBIA: Brazilian Symposium on Artificial Intelligence
Scale-Space: Scale-Space Theories in Computer Vision
XPS: Knowledge-Based Systems
I2CS: Innovative Internet Computing Systems
TARK: Theoretical Aspects of Rationality and Knowledge Meeting
MKM: International Workshop on Mathematical Knowledge Management
ACIVS: International Conference on Advanced Concepts For Intelligent Vision Systems
ATAL: Agent Theories, Architectures, and Languages
LACL: International Conference on Logical Aspects of Computational Linguistics
AREA: Hardware and Architecture
Rank 1:
ASPLOS: Architectural Support for Prog Lang and OS
ISCA: ACM/IEEE Symp on Computer Architecture
ICCAD: Intl Conf on Computer-Aided Design
DAC: Design Automation Conf
MICRO: Intl Symp on Microarchitecture
HPCA: IEEE Symp on High-Perf Comp Architecture
Rank 2:
FCCM: IEEE Symposium on Field Programmable Custom Computing Machines
SUPER: ACM/IEEE Supercomputing Conference
ICS: Intl Conf on Supercomputing
ISSCC: IEEE Intl Solid-State Circuits Conf
HCS: Hot Chips Symp
VLSI: IEEE Symp VLSI Circuits
CODES+ISSS: Intl Conf on Hardware/Software Codesign & System Synthesis
DATE: IEEE/ACM Design, Automation & Test in Europe Conference
FPL: Field-Programmable Logic and Applications
CASES: International Conference on Compilers, Architecture, and Synthesis for Embedded Systems
Rank 3:
ICA3PP: Algs and Archs for Parall Proc
EuroMICRO: New Frontiers of Information Technology
ACS: Australian Supercomputing Conf
ISC: Information Security Conference
Un-ranked:
Advanced Research in VLSI
International Symposium on System Synthesis
International Symposium on Computer Design
International Symposium on Circuits and Systems
Asia Pacific Design Automation Conference
International Symposium on Physical Design
International Conference on VLSI Design
CANPC: Communication, Architecture, and Applications for Network-Based Parallel Computing
CHARME: Conference on Correct Hardware Design and Verification Methods
CHES: Cryptographic Hardware and Embedded Systems
NDSS: Network and Distributed System Security Symposium
NOSA: Nordic Symposium on Software Architecture
ACAC: Australasian Computer Architecture Conference
CSCC: WSES/IEEE world multiconference on Circuits, Systems, Communications & Computers
ICN: IEEE International Conference on Networking
Topology in Computer Science Conference
AREA: Applications and Media
Rank 1:
I3DG: ACM-SIGGRAPH Interactive 3D Graphics
SIGGRAPH: ACM SIGGRAPH Conference
ACM-MM: ACM Multimedia Conference
DCC: Data Compression Conf
SIGMETRICS: ACM Conf on Meas. & Modelling of Comp Sys
SIGIR: ACM SIGIR Conf on Information Retrieval
PECCS: IFIP Intl Conf on Perf Eval of Comp & Comm Sys
WWW: World-Wide Web Conference
Rank 2:
IEEE Visualization
EUROGRAPH: European Graphics Conference
CGI: Computer Graphics International
CANIM: Computer Animation
PG: Pacific Graphics
ICME: Intl Conf on MMedia & Expo
NOSSDAV: Network and OS Support for Digital A/V
PADS: ACM/IEEE/SCS Workshop on Parallel & Dist Simulation
WSC: Winter Simulation Conference
ASS: IEEE Annual Simulation Symposium
MASCOTS: Symp Model Analysis & Sim of Comp & Telecom Sys
PT: Perf Tools - Intl Conf on Model Tech & Tools for CPE
NetStore: Network Storage Symposium
MMCN: ACM/SPIE Multimedia Computing and Networking
JCDL: Joint Conference on Digital Libraries
Rank 3:
ACM-HPC: ACM Hypertext Conf
MMM: Multimedia Modelling
DSS: Distributed Simulation Symposium
SCSC: Summer Computer Simulation Conference
WCSS: World Congress on Systems Simulation
ESS: European Simulation Symposium
ESM: European Simulation Multiconference
HPCN: High-Performance Computing and Networking
Geometry Modeling and Processing
WISE
DS-RT: Distributed Simulation and Real-time Applications
IEEE Intl Wshop on Dist Int Simul and Real-Time Applications
ECIR: European Colloquium on Information Retrieval
Ed-Media
IMSA: Intl Conf on Internet and MMedia Sys
Un-ranked:
DVAT: IS&T/SPIE Conf on Dig Video Compression Alg & Tech
MME: IEEE Intl Conf. on Multimedia in Education
ICMSO: Intl Conf on Modelling, Simulation and Optimisation
ICMS: IASTED Intl Conf on Modelling and Simulation
COTIM: Conference on Telecommunications and Information Markets
DOA: International Symposium on Distributed Objects and Applications
ECMAST: European Conference on Multimedia Applications, Services and Techniques
GIS: Workshop on Advances in Geographic Information Systems
IDA: Intelligent Data Analysis
IDMS: Interactive Distributed Multimedia Systems and Telecommunication Services
IUI: Intelligent User Interfaces
MIS: Workshop on Multimedia Information Systems
WECWIS: Workshop on Advanced Issues of E-Commerce and Web/based Information Systems
WIDM: Web Information and Data Management
WOWMOM: Workshop on Wireless Mobile Multimedia
WSCG: International Conference in Central Europe on Computer Graphics and Visualization
LDTA: Workshop on Language Descriptions, Tools and Applications
IPDPSWPIM: International Workshop on Parallel and Distributed Computing Issues in Wireless Networks and Mobile Computing
IWST: International Workshop on Scheduling and Telecommunications
APDCM: Workshop on Advances in Parallel and Distributed Computational Models
CIMA: International ICSC Congress on Computational Intelligence: Methods and Applications
FLA: Fuzzy Logic and Applications Meeting
ICACSD: International Conference on Application of Concurrency to System Design
ICATPN: International conference on application and theory of Petri nets
AICCSA: ACS International Conference on Computer Systems and Applications
CAGD: International Symposium of Computer Aided Geometric Design
Spanish Symposium on Pattern Recognition and Image Analysis
International Workshop on Cluster Infrastructure for Web Server and E-Commerce Applications
WSES ISA: Information Science And Applications Conference
CHT: International Symposium on Advances in Computational Heat Transfer
IMACS: International Conference on Applications of Computer Algebra
VIPromCom: International Symposium on Video Processing and Multimedia Communications
PDMPR: International Workshop on Parallel and Distributed Multimedia Processing & Retrieval
International Symposium On Computational And Applied PDEs
PDCAT: International Conference on Parallel and Distributed Computing, Applications, and Techniques
Biennial Computational Techniques and Applications Conference
Symposium on Advanced Computing in Financial Markets
WCCE: World Conference on Computers in Education
ITCOM: SPIE's International Symposium on The Convergence of Information Technologies and Communications
Conference on Commercial Applications for High-Performance Computing
MSA: Metacomputing Systems and Applications Workshop
WPMC: International Symposium on Wireless Personal Multimedia Communications
WSC: Online World Conference on Soft Computing in Industrial Applications
HERCMA: Hellenic European Research on Computer Mathematics and its Applications
PARA: Workshop on Applied Parallel Computing
International Computer Science Conference: Active Media Technology
IW-MMDBMS - Int. Workshop on Multi-Media Data Base Management Systems
AREA: System Technology
Rank 1:
SIGCOMM: ACM Conf on Comm Architectures, Protocols & Apps
INFOCOM: Annual Joint Conf IEEE Comp & Comm Soc
SPAA: Symp on Parallel Algms and Architecture
PODC: ACM Symp on Principles of Distributed Computing
PPoPP: Principles and Practice of Parallel Programming
RTSS: Real Time Systems Symp
SOSP: ACM SIGOPS Symp on OS Principles
OSDI: Usenix Symp on OS Design and Implementation
CCS: ACM Conf on Comp and Communications Security
IEEE Symposium on Security and Privacy
MOBICOM: ACM Intl Conf on Mobile Computing and Networking
USENIX Conf on Internet Tech and Sys
ICNP: Intl Conf on Network Protocols
PACT: Intl Conf on Parallel Arch and Compil Tech
RTAS: IEEE Real-Time and Embedded Technology and Applications Symposium
ICDCS: IEEE Intl Conf on Distributed Comp Systems
Rank 2:
CC: Compiler Construction
IPDPS: Intl Parallel and Dist Processing Symp
IC3N: Intl Conf on Comp Comm and Networks
ICPP: Intl Conf on Parallel Processing
SRDS: Symp on Reliable Distributed Systems
MPPOI: Massively Par Proc Using Opt Interconns
ASAP: Intl Conf on Apps for Specific Array Processors
Euro-Par: European Conf. on Parallel Computing
Fast Software Encryption
Usenix Security Symposium
European Symposium on Research in Computer Security
WCW: Web Caching Workshop
LCN: IEEE Annual Conference on Local Computer Networks
IPCCC: IEEE Intl Phoenix Conf on Comp & Communications
CCC: Cluster Computing Conference
ICC: Intl Conf on Comm
WCNC: IEEE Wireless Communications and Networking Conference
CSFW: IEEE Computer Security Foundations Workshop
Rank 3:
MPCS: Intl. Conf. on Massively Parallel Computing Systems
GLOBECOM: Global Comm
ICCC: Intl Conf on Comp Communication
NOMS: IEEE Network Operations and Management Symp
CONPAR: Intl Conf on Vector and Parallel Processing
VAPP: Vector and Parallel Processing
ICPADS: Intl Conf. on Parallel and Distributed Systems
Public Key Cryptosystems
Annual Workshop on Selected Areas in Cryptography
Australasia Conference on Information Security and Privacy
Int. Conf on Inform and Comm. Security
Financial Cryptography
Workshop on Information Hiding
Smart Card Research and Advanced Application Conference
ICON: Intl Conf on Networks
NCC: Nat Conf Comm
IN: IEEE Intell Network Workshop
Softcomm: Conf on Software in Tcomms and Comp Networks
INET: Internet Society Conf
Workshop on Security and Privacy in E-commerce
Un-ranked:
PARCO: Parallel Computing
SE: Intl Conf on Systems Engineering (**)
PDSECA: workshop on Parallel and Distributed Scientific and Engineering Computing with Applications
CACS: Computer Audit, Control and Security Conference
SREIS: Symposium on Requirements Engineering for Information Security
SAFECOMP: International Conference on Computer Safety, Reliability and Security
IREJVM: Workshop on Intermediate Representation Engineering for the Java Virtual Machine
EC: ACM Conference on Electronic Commerce
EWSPT: European Workshop on Software Process Technology
HotOS: Workshop on Hot Topics in Operating Systems
HPTS: High Performance Transaction Systems
Hybrid Systems
ICEIS: International Conference on Enterprise Information Systems
IOPADS: I/O in Parallel and Distributed Systems
IRREGULAR: Workshop on Parallel Algorithms for Irregularly Structured Problems
KiVS: Kommunikation in Verteilten Systemen
LCR: Languages, Compilers, and Run-Time Systems for Scalable Computers
MCS: Multiple Classifier Systems
MSS: Symposium on Mass Storage Systems
NGITS: Next Generation Information Technologies and Systems
OOIS: Object Oriented Information Systems
SCM: System Configuration Management
Security Protocols Workshop
SIGOPS European Workshop
SPDP: Symposium on Parallel and Distributed Processing
TreDS: Trends in Distributed Systems
USENIX Technical Conference
VISUAL: Visual Information and Information Systems
FoDS: Foundations of Distributed Systems: Design and Verification of Protocols Conference
RV: Post-CAV Workshop on Runtime Verification
ICAIS: International ICSC-NAISO Congress on Autonomous Intelligent Systems
ITiCSE: Conference on Integrating Technology into Computer Science Education
CSCS: CyberSystems and Computer Science Conference
AUIC: Australasian User Interface Conference
ITI: Meeting of Researchers in Computer Science, Information Systems Research & Statistics
European Conference on Parallel Processing
RODLICS: Wses International Conference on Robotics, Distance Learning & Intelligent Communication Systems
International Conference On Multimedia, Internet & Video Technologies
PaCT: Parallel Computing Technologies workshop
PPAM: International Conference on Parallel Processing and Applied Mathematics
International Conference On Information Networks, Systems And Technologies
AmiRE: Conference on Autonomous Minirobots for Research and Edutainment
DSN: The International Conference on Dependable Systems and Networks
IHW: Information Hiding Workshop
GTVMT: International Workshop on Graph Transformation and Visual Modeling Techniques
AREA: Programming Languages and Software Engineering
Rank 1:
POPL: ACM-SIGACT Symp on Principles of Prog Langs
PLDI: ACM-SIGPLAN Symp on Prog Lang Design & Impl
OOPSLA: OO Prog Systems, Langs and Applications
ICFP: Intl Conf on Functional Programming
JICSLP/ICLP/ILPS: (Joint) Intl Conf/Symp on Logic Prog
ICSE: Intl Conf on Software Engineering
FSE: ACM Conf on the Foundations of Software Engineering (inc: ESEC-FSE)
FM/FME: Formal Methods, World Congress/Europe
CAV: Computer Aided Verification
Rank 2:
CP: Intl Conf on Principles & Practice of Constraint Prog
TACAS: Tools and Algos for the Const and An of Systems
ESOP: European Conf on Programming
ICCL: IEEE Intl Conf on Computer Languages
PEPM: Symp on Partial Evaluation and Prog Manipulation
SAS: Static Analysis Symposium
RTA: Rewriting Techniques and Applications
IWSSD: Intl Workshop on S/W Spec & Design
CAiSE: Intl Conf on Advanced Info System Engineering
SSR: ACM SIGSOFT Working Conf on Software Reusability
SEKE: Intl Conf on S/E and Knowledge Engineering
ICSR: IEEE Intl Conf on Software Reuse
ASE: Automated Software Engineering Conference
PADL: Practical Aspects of Declarative Languages
ISRE: Requirements Engineering
ICECCS: IEEE Intl Conf on Eng. of Complex Computer Systems
IEEE Intl Conf on Formal Engineering Methods
Intl Conf on Integrated Formal Methods
FOSSACS: Foundations of Software Science and Comp Struct
APLAS: Asian Symposium on Programming Languages and Systems
MPC: Mathematics of Program Construction
ECOOP: European Conference on Object-Oriented Programming
ICSM: Intl. Conf on Software Maintenance
HASKELL - Haskell Workshop
Rank 3:
FASE: Fund Appr to Soft Eng
APSEC: Asia-Pacific S/E Conf
PAP/PACT: Practical Aspects of PROLOG/Constraint Tech
ALP: Intl Conf on Algebraic and Logic Programming
PLILP: Prog Lang Implementation & Logic Programming
LOPSTR: Intl Workshop on Logic Prog Synthesis & Transf
ICCC: Intl Conf on Compiler Construction
COMPSAC: Intl. Computer S/W and Applications Conf
TAPSOFT: Intl Joint Conf on Theory & Pract of S/W Dev
WCRE: SIGSOFT Working Conf on Reverse Engineering
AQSDT: Symp on Assessment of Quality S/W Dev Tools
IFIP Intl Conf on Open Distributed Processing
Intl Conf of Z Users
IFIP Joint Int'l Conference on Formal Description Techniques and Protocol Specification, Testing, And Verification
PSI (Ershov conference)
UML: International Conference on the Unified Modeling Language
Un-ranked:
Australian Software Engineering Conference
IEEE Int. W'shop on Object-oriented Real-time Dependable Sys. (WORDS)
IEEE International Symposium on High Assurance Systems Engineering
The Northern Formal Methods Workshops
Formal Methods Pacific
Int. Workshop on Formal Methods for Industrial Critical Systems
JFPLC - International French Speaking Conference on Logic and Constraint Programming
L&L - Workshop on Logic and Learning
SFP - Scottish Functional Programming Workshop
LCCS - International Workshop on Logic and Complexity in Computer Science
VLFM - Visual Languages and Formal Methods
NASA LaRC Formal Methods Workshop
PASTE: Workshop on Program Analysis For Software Tools and Engineering
TLCA: Typed Lambda Calculus and Applications
FATES - A Satellite workshop on Formal Approaches to Testing of Software
Workshop On Java For High-Performance Computing
DSLSE - Domain-Specific Languages for Software Engineering
FTJP - Workshop on Formal Techniques for Java Programs
WFLP - International Workshop on Functional and (Constraint) Logic Programming
FOOL - International Workshop on Foundations of Object-Oriented Languages
SREIS - Symposium on Requirements Engineering for Information Security
HLPP - International workshop on High-level parallel programming and applications
INAP - International Conference on Applications of Prolog
MPOOL - Workshop on Multiparadigm Programming with OO Languages
PADO - Symposium on Programs as Data Objects
TOOLS: Int'l Conf Technology of Object-Oriented Languages and Systems
Australasian Conference on Parallel And Real-Time Systems
AvoCS: Workshop on Automated Verification of Critical Systems
SPIN: Workshop on Model Checking of Software
FemSys: Workshop on Formal Design of Safety Critical Embedded Systems
Ada-Europe
PPDP: Principles and Practice of Declarative Programming
APL Conference
ASM: Workshops on Abstract State Machines
COORDINATION: Coordination Models and Languages
DocEng: ACM Symposium on Document Engineering
DSV-IS: Design, Specification, and Verification of Interactive Systems
FMCAD: Formal Methods in Computer-Aided Design
FMLDO: Workshop on Foundations of Models and Languages for Data and Objects
IFL: Implementation of Functional Languages
ILP: International Workshop on Inductive Logic Programming
ISSTA: International Symposium on Software Testing and Analysis
ITC: International Test Conference
IWFM: Irish Workshop in Formal Methods
Java Grande
LP: Logic Programming: Japanese Conference
LPAR: Logic Programming and Automated Reasoning
LPE: Workshop on Logic Programming Environments
LPNMR: Logic Programming and Non-monotonic Reasoning
PJW: Workshop on Persistence and Java
RCLP: Russian Conference on Logic Programming
STEP: Software Technology and Engineering Practice
TestCom: IFIP International Conference on Testing of Communicating Systems
VL: Visual Languages
FMPPTA: Workshop on Formal Methods for Parallel Programming Theory and Applications
WRS: International Workshop on Reduction Strategies in Rewriting and Programming
FORMALWARE: Meeting on Formalware Engineering: Formal Methods for Engineering Software
DRE: Conference on Data Reverse Engineering
STAREAST: Software Testing Analysis & Review Conference
Conference on Applied Mathematics and Scientific Computing
International Testing Computer Software Conference
Linux Showcase & Conference
FLOPS: International Symposium on Functional and Logic Programming
GCSE: International Conference on Generative and Component-Based Software Engineering
JOSES: Java Optimization Strategies for Embedded Systems
AADEBUG: Automated and Algorithmic Debugging
AMAST: Algebraic Methodology and Software Technology
AREA: Algorithms and Theory
Rank 1:
STOC: ACM Symp on Theory of Computing
FOCS: IEEE Symp on Foundations of Computer Science
COLT: Computational Learning Theory
LICS: IEEE Symp on Logic in Computer Science
SCG: ACM Symp on Computational Geometry
SODA: ACM/SIAM Symp on Discrete Algorithms
SPAA: ACM Symp on Parallel Algorithms and Architectures
ISSAC: Intl. Symp on Symbolic and Algebraic Computation
CRYPTO: Advances in Cryptology
Rank 2:
EUROCRYPT: European Conf on Cryptography
CONCUR: International Conference on Concurrency Theory
ICALP: Intl Colloquium on Automata, Languages and Prog
STACS: Symp on Theoretical Aspects of Computer Science
CC: IEEE Symp on Computational Complexity
WADS: Workshop on Algorithms and Data Structures
MFCS: Mathematical Foundations of Computer Science
SWAT: Scandinavian Workshop on Algorithm Theory
ESA: European Symp on Algorithms
IPCO: MPS Conf on integer programming & comb optimization
LFCS: Logical Foundations of Computer Science
ALT: Algorithmic Learning Theory
EUROCOLT: European Conf on Learning Theory
DISC: Int'l Symp on Distributed Computing (formerly WDAG: Workshop on Distributed Algorithms)
ISTCS: Israel Symp on Theory of Computing and Systems
ISAAC: Intl Symp on Algorithms and Computation
FST&TCS: Foundations of S/W Tech & Theoretical CS
LATIN: Intl Symp on Latin American Theoretical Informatics
CADE: Conf on Automated Deduction
IEEEIT: IEEE Symposium on Information Theory
Asiacrypt
Rank 3:
MEGA: Méthodes Effectives en Géométrie Algébrique
ASIAN: Asian Computing Science Conf
CCCG: Canadian Conf on Computational Geometry
FCT: Fundamentals of Computation Theory
WG: Workshop on Graph Theory
CIAC: Italian Conf on Algorithms and Complexity
ICCI: Advances in Computing and Information
AWTI: Argentine Workshop on Theoretical Informatics
CATS: The Australian Theory Symp
COCOON: Annual Intl Computing and Combinatorics Conf
UMC: Unconventional Models of Computation
MCU: Universal Machines and Computations
GD: Graph Drawing
SIROCCO: Structural Info & Communication Complexity
ALEX: Algorithms and Experiments
ALG: ENGG Workshop on Algorithm Engineering
LPMA: Intl Workshop on Logic Programming and Multi-Agents
EWLR: European Workshop on Learning Robots
CITB: Complexity & info-theoretic approaches to biology
FTP: Intl Workshop on First-Order Theorem Proving (FTP)
CSL: Annual Conf on Computer Science Logic (CSL)
AAECC: Conf On Applied Algebra, Algebraic Algms & ECC
DMTCS: Intl Conf on Disc Math and TCS
JCDCG: Japan Conference on Discrete and Computational Geometry
Un-ranked:
Information Theory Workshop
CL: International Conference on Computational Logic
COSIT: Spatial Information Theory
ETAPS: European joint conference on Theory And Practice of Software
ICCS: International Conference on Conceptual Structures
ICISC: Information Security and Cryptology
PPSN: Parallel Problem Solving from Nature
SOFSEM: Conference on Current Trends in Theory and Practice of Informatics
TPHOLs: Theorem Proving in Higher Order Logics
WADT: Workshop on Algebraic Development Techniques
TERM: Thematic Term: Semigroups, Algorithms, Automata and Languages
IMGTA: Italian Meeting on Game Theory and Applications
DLT: Developments in Language Theory
International Workshop on Discrete Algorithms and Methods for Mobile Computing and Communications
APPROX: International Workshop on Approximation Algorithms for Combinatorial Optimization Problems
WAE: Workshop on Algorithm Engineering
CMFT: Computational Methods and Function Theory
AWOCA: Australasian Workshop on Combinatorial Algorithms
Fun with Algorithms Meeting
ICTCS: Italian Conference on Theoretical Computer Science
ComMaC: International Conference On Computational Mathematics
TLCA: Typed Lambda Calculus and Applications
DCAGRS: Workshop on Descriptional Complexity of Automata, Grammars and Related Structures
AREA: Biomedical
Rank 1:
RECOMB: Annual Intl Conf on Comp Molecular Biology
ISMB: International Conference on Intelligent Systems for Molecular Biology
Rank 2:
AMIA: American Medical Informatics Annual Fall Symposium
DNA: Meeting on DNA Based Computers
WABI: Workshop on Algorithms in Bioinformatics
Rank 3:
MEDINFO: World Congress on Medical Informatics
International Conference on Sequences and their Applications
ECAIM: European Conf on AI in Medicine
APAMI: Asia Pacific Assoc for Medical Informatics Conf
INBS: IEEE Intl Symp on Intell. in Neural & Bio Systems
Un-ranked:
MCBC: Wses conf on Mathematics And Computers In Biology And Chemistry
KDDMBD - Knowledge Discovery and Data Mining in Biological Databases Meeting
AREA: Miscellaneous
Rank 1:
Rank 2:
CSCW: Conference on Computer Supported Cooperative Work (*)
Rank 3:
SAC: ACM/SIGAPP Symposium on Applied Computing
ICSC: International Computer Science Conference
ISCIS: Intl Symp on Computer and Information Sciences
ICSC2: International Computer Symposium Conference
ICCE: Intl Conf on Comps in Edu
WCC: World Computing Congress
PATAT: Practice and Theory of Automated Timetabling
Un-ranked:
ICCI: International Conference on Cognitive Informatics
APISIT: Asia Pacific International Symposium on Information Technology
CW: The International Conference on Cyberworlds
Workshop on Open Hypermedia Systems
Workshop on Middleware for Mobile Computing
International Working Conference on Distributed Applications and Interoperable Systems
ADL: Advances in Digital Libraries
ADT: Specification of Abstract Data Type Workshops
AVI: Working Conference on Advanced Visual Interfaces
DL: Digital Libraries
DLog: Description Logics
ECDL: European Conference on Digital Libraries
EDCC: European Dependable Computing Conference
FroCos: Frontiers of Combining Systems
FTCS: Symposium on Fault-Tolerant Computing
IFIP World Computer Congress
INTEROP: Interoperating Geographic Information Systems
IO: Information Outlook
IQ: MIT Conference on Information Quality
IUC: International Unicode Conference
IWMM: International Workshop on Memory Management
MD: IEEE Meta-Data Conference
Middleware
MLDM: Machine Learning and Data Mining in Pattern Recognition
POS: Workshop on Persistent Object Systems
SCCC: International Conference of the Chilean Computer Science Society
SPIRE: String Processing and Information Retrieval
TABLEAUX: Analytic Tableaux and Related Methods
TIME Workshops
TREC: Text REtrieval Conference
UIDIS: User Interfaces to Data Intensive Systems
VRML Conference
AFIPS: American Federation of Information Processing Societies
ACSC: Australasian Computer Science Conference
CMCS: Coalgebraic Methods in Computer Science
BCTCS: British Colloquium for Theoretical Computer Science
IJCAR: The International Joint Conference on Automated Reasoning
STRATEGIES: International Workshop on Strategies in Automated Deduction
UNIF: International Workshop on Unification
SOCO: Meeting on Soft Computing
ConCoord: International Workshop on Concurrency and Coordination
CIAA: International Conference on Implementation and Application of Automata
Workshop on Information Structure, Discourse Structure and Discourse Semantics
RANDOM: International Workshop on Randomization and Approximation Techniques in Computer Science
WMC: Workshop on Membrane Computing
FI-CS: Fixed Points in Computer Science
DC Computer Science Conference
Workshop on Novel Approaches to Hard Discrete Optimization
NALAC: Numerical Analysis, Linear Algebra And Computations Conference
ICLSSC: International Conference on Large-Scale Scientific Computations
ISACA: Information Systems Audit and Control Association International Conference
ICOSAHOM: International Conference On Spectral And High Order Methods
AIP: International Conference on Applied Inverse Problems: Theoretical and Computational Aspects
ECCM: European Conference On Computational Mechanics
Scicade: Scientific Computing and Differential Equations
BMVC: British Machine Vision Conference
COMEP: Euroconference On Computational Mechanics And Engineering Practice
JCIS: Joint Conference on Information Sciences
CHP: Compilers for High Performance conference
SIAM Conference on Geometric Design and Computing
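Notes on SCI Papers
A Complete Guide to SCI Papers: Planning and Writing
I. Recommended writing framework and requirements for each section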
Title: Be short, accurate, and unambiguous; Give your paper a distinct personality; Begin with the subject of the study.
Introduction: What is known; What is unknown; Why we did this study?
Methods: Participants, subjects; Measurements; Outcomes and explanatory variables; Statistical methods.
Results: Sample characteristics; Univariate analyses; Bivariate analyses; Multivariate analyses.
Tables and figures: No more than six tables or figures; Use Table 1 for sample characteristics (no P values); Put most important findings in a figure.
Discussion: State what you found; Outline the strengths and limitations of the study; Discuss the relevance to current literature; Outline your implications with a clear "So what?" and "Where now?"
References: All citations must be accurate; Include only the most important, most rigorous, and most recent literature; Quote only published journal articles or books; Never quote "second hand"; Cite only 20-35 references.
Formatting: Include the title, author, page numbers, etc. in headers and footers; Start each section on a new page; Format titles and subtitles consistently; Comply with "Instructions to authors".
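II. Language techniques for writing in English
1. Introduction:
A. How do you point out the shortcomings of current research and lead purposefully into the importance of your own work? After describing previous results, use However to introduce the shortcomings and propose a new method or direction. For example: However, little information (little attention/little work/little data/little research...) (or few studies/few investigations/few researchers/few attempts...) (or no/none of these studies...) has (have) been done on (focused on/attempted to/conducted/investigated/studied (with respect to)). For example: Previous research (studies, records) has (have) failed to consider/ignored/misinterpreted/neglected to/overestimated/underestimated/misled. Thus, these previous results are inconclusive, misleading, unsatisfactory, questionable, controversial. Uncertainties (discrepancies) still exist... When your method and direction are the same as those of earlier work, you can emphasize your own contribution as follows: However, data is still scarce (rare, less accurate). We need to (aim to, have to) provide more documents (data, records, studies, increase the dataset). Further studies are still necessary (essential)...
To emphasize the importance of your research, you generally also need to introduce, before However, the problems that contrast with or relate to your research question. For example: (1) time scale; (2) research method; (3) study area; (4) uncertainty; (5) a hypothesis of your own to be tested. If the problem you study is relatively recent in time, you can discuss at length the research on older periods and its importance, and then (However) state that "problems on more recent time scales have been insufficiently studied"; if yours is a new research method or direction, you can present the currently popular methods and their properties, and then (However) say that the direction or method you study has received little attention; if the research involves a regional question, first summarize the research on neighboring or other regions, and then (However) stress that research on this region is insufficient; if a problem has been studied extensively but two or more views currently coexist, such uncertainties or ambiguities deserve further clarification; if your research is entirely new, with no earlier work to compare against, you can say with confidence that "according to the process proposed in the hypothesis, such a result is possible, and this paper sets out to confirm it", and so on. We aim to test the feasibility (reliability) of the... It is hoped that the question will be resolved (fall away) with our proposed method (approach).
B. Stating your own viewpoint: We aim to // This paper reports on // This paper provides results // This paper extends the method // This paper focuses on... The purpose of this paper is to... Furthermore, Moreover, In addition, we will also discuss...
C. Delimiting the scope of your research: another role of the introduction is to tell the reader (including the reviewer) what your paper mainly covers. If this is handled poorly, the reviewer will raise harsh comments, for example that you did not consider some possibility or some research method. To reduce such disputes, the end of the introduction must state the scope of the paper explicitly: (1) time scale; (2) study area; and so on. If a long time series is involved, you can state explicitly that the paper is concerned only with a particular time range: We preliminarily focus on the older (younger)... If there are two time scales (long-term and short-term), you can say that both are important but that this paper covers only one of them. For the study area, as with time, you must also state explicitly that you are concerned only with a particular region.
D. Rounding off: at the end of the introduction you can also state, by way of summary, "how this research helps other research", or say that further studies on... will be summarized in our next study (or elsewhere). In short, the aim is to focus the reader's attention on the problem you want to discuss and to minimize unnecessary arguments.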
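2. Discussion:
A. How to put forward a viewpoint: the strategy you adopt when presenting your own views matters, and unsuitable sentences are usually questioned by reviewers. (1) If the viewpoint is not first proposed in this paper, usually use We confirm that... (2) For a viewpoint you are very confident of, use We believe that... (3) Usually, when a conclusion is inferred from data, use Results indicate, infer, suggest, imply that... (4) Only in very exceptional cases use We put forward (discover, observe)... "for the first time" to stress your own novelty... (5) If you are not fully certain of the viewpoint, use We tentatively put forward (interpret this to...) Or The results may be due to (caused by) attributed to resulted from... Or This is probably a consequence of... It seems that... can account for (interpret) this... Or It is possible that it stems from... Take care to combine these constructions sensibly. If the whole paper consists of types (1) and (5), its significance is greatly diminished. If it is all type (2), it will certainly be questioned. So analyze the novelty and credibility of your results carefully.
B. Connectives and logic: the most common fault in English papers is unclear logic; the remedies are as follows.
(1) Make each sentence cohere with what precedes and follows it; do not leave sentences standing in isolation. Common connectives include However, also, in addition, consequently, afterwards, moreover, Furthermore, further, although, unlike, in contrast, Similarly, Unfortunately, alternatively, parallel results, In order to, despite, For example, Compared with, other results, thus, therefore... Used well, connectives make the structure of the paper clear and its meaning precise. For example, when describing events or literature in chronological order, for the earliest work use AA advocated it for the first time. Next use Then BB further demonstrated that. After that, use Afterwards, CC... If there are more, use More recent studies by DD... When describing two viewpoints, separate them sharply: AA put forward that... In contrast, BB believe or Unlike AA, BB suggest or On the contrary (which indicates that the preceding view is wrong); if you only want to show that the two views are opposed, use in contrast BB... If the two views are similar, use AA suggest... Similarly, alternatively, BB... Or Also, BB or BB also does... To express cause and effect or sequence, use Consequently, therefore, as a result... To express progression, use furthermore, further, moreover, in addition... After writing a paragraph of English, it is best to check first whether these connectives have been used well.
(2) Attend to the overall logic of the paragraph layout: we often need to describe several aspects of one problem. In that case the logical structure needs care. The first paragraph should tell the reader explicitly how many parts you will discuss... Therefore, there are three aspects of this problem that have to be addressed. The first question involves... The second problem relates to... The third aspect deals with... Set out the points clearly, layer by layer. You can also use First, Second, Third, Finally... directly. Of course, Furthermore, in addition and the like can be used for supplementary remarks.
(3) Overall structure of the discussion section: subheadings are a good way to divide the problem into segments. The first segment usually presents the paper's most important data or results; supplementary material goes in the last segment. Bear in mind that the paper's readers come at several levels; besides specialists in your own field, you must find ways to make it understandable to more readers from other fields. So the discussion can be split into two parts, one presenting the viewpoints and the other detailing the process and the basis of the argument. Readers outside the field can then grasp the paper's main points and treat the more specialized discussion as a black box, while specialists in the field can study that part further.
C. What does the discussion section contain? (1) A summary of the main data and their characteristics; (2) the main conclusions and a comparison with earlier views; (3) the shortcomings of the paper. Authors generally regard the third point as inadvisable, but in fact stating the paper's shortcomings is precisely an important means of protecting it. Deliberately hiding the paper's weaknesses in the belief that others will not notice them is very unwise. The shortcomings include the following: (1) if the problem studied is somewhat one-sided, the discussion must say, It should be noted that this study has examined only... We concentrate (focus) on only... We have to point out that we do not... Some limitations of this study are... (2) if the conclusions fall somewhat short, The results do not imply... The results can not be used to determine (or be taken as evidence of)... Unfortunately, we can not determine this from this data... Our results lack... After pointing out these shortcomings, however, you must immediately reinforce the importance of the paper and the means that might be taken to address them, laying the groundwork for the next study by others or by yourself. Notwithstanding its limitation, this study does suggest... However, these problems could be solved if we consider... Despite its preliminary character, this study can clearly indicate... In Chinese terms this is covering both sides (左右逢源): you give an account in advance of the questions the reviewer is likely to raise, and show that you are already thinking about them but cannot answer them for now because of constraints on paper length, experimental progress or experimental methods. With some of your suggestions, though, these questions may be answered in future research.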
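3. Others:
A. To keep the paper clear, when a concept is introduced for the first time it is best to give a fuller explanation in parentheses. If the paper uses many abbreviations, there are two ways to handle them: (1) add an Appendix at the end of the paper listing all abbreviations; (2) repeat the meaning of an abbreviation from time to time on different pages to remind the reader.
B. Never dismiss earlier results wholesale, even if the earlier conclusions seem completely wrong to you. This is the most basic respect for earlier work; in English it is called giving credit to others' work. So the paper should contain no very negative judgments such as Their results are wrong, very questionable, have no common sense, etc. In such cases, put it tactfully: Their studies may be more reasonable if they had... considered this situation. Their results could be more convincing if they... Or Their conclusion may retain some uncertainties.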
Main Bayesian Webpages
1. http://www.cs.berkeley.edu/~murphyk/Bayes/bayes.html
2. http://www.bayesian.org/
3. http://www.bayes.com/
4. http://www.bayesinf.com/
5. http://xxx.lanl.gov/archive/bayes-an/
Sunday, May 23, 2010
Tuesday, May 18, 2010
Friday, April 30, 2010
Sunday, March 7, 2010
Tuesday, February 16, 2010
Friday, February 12, 2010
Bayesian Network Repository
Mission
Our intention is to construct a repository that will support empirical research within our community by facilitating (1) better reproducibility of results, and (2) better comparisons among competing approaches. Both of these are required to measure progress on problems that are commonly agreed upon, such as inference and learning.
A motivation for this repository is outlined in "Challenge: Where is the impact of Bayesian networks in learning?" by N. Friedman, M. Goldszmidt, D. Heckerman, and S. Russell (IJCAI-97).
This will be achieved by several progressive steps:
Sharing domains. This would allow for reproduction of results, and also allow researchers in the community to run large scale empirical tests.
Sharing task specification. Sharing domains is not enough to compare algorithms. Even if two papers examine inference in a particular network, they might be answering different queries or assuming different evidence sets. The intent here is to store specific tasks. For example, in inference this might be a specific series of observations/queries. In learning, this might be a particular collection of training sets that have a particular pattern of missing data.
Sharing task evaluation. Even if two researchers examine the same task, they might use different measures to evaluate their algorithms. By sharing evaluation methods, we hope to allow for an objective comparison. In some cases such evaluation methods can be shared programs, such as a program that evaluates the quality of a learned model by computing the KL divergence to the "real" distribution. In other cases, such an evaluation method might be an agreed-upon measure of performance, such as space requirements, number of floating point operations, etc.
Organized competitions. One of the dangers of empirical research is that the methods examined become overly tuned to specific evaluation domains. To avoid that danger, it is necessary to use "fresh" problems. The intention is to organize competitions that would address specific problems, such as causal discovery, on unseen domains.
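To make the KL-divergence evaluation mentioned under "Sharing task evaluation" concrete, here is a minimal Python sketch. It assumes that both the "real" and the learned distribution are small enough to enumerate as dictionaries over the same outcomes. The function name kl_divergence and the toy distributions are illustrative assumptions, not part of the repository; a real evaluation program would obtain these probabilities from the joint distributions of the two networks.

```python
import math


def kl_divergence(p, q):
    """Return KL(p || q) in nats for two discrete distributions.

    p and q are dicts mapping outcomes to probabilities (hypothetical format).
    """
    total = 0.0
    for outcome, p_x in p.items():
        if p_x == 0.0:
            continue  # a term with p(x) = 0 contributes nothing
        q_x = q.get(outcome, 0.0)
        if q_x == 0.0:
            return math.inf  # the learned model rules out a possible outcome
        total += p_x * math.log(p_x / q_x)
    return total


# Toy example: the "real" distribution versus a learned approximation.
true_dist = {"a": 0.5, "b": 0.5}
learned_dist = {"a": 0.9, "b": 0.1}
print(kl_divergence(true_dist, learned_dist))
```

A value of 0 means the learned distribution matches the real one exactly; larger values indicate a worse fit, so two learning algorithms can be compared on the same domain by this single number.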
Plans for the future
Currently, this site contains several domains. The plan is to gradually add other components discussed above.
Please send suggestions and contributions to galel@cs.huji.ac.il.
Acknowledgements
Thanks to Fabio Cozman, Bruce D'Ambrosio, Moises Goldszmidt, David Heckerman, Othar Hansson, Daphne Koller, and Stuart Russell for discussions about the organization of this site. Thanks to John Binder, Jack Breese, David Heckerman, Uffe Kjærulff, and Mark Peot for contributing networks.
--------------------------------------------------------------------------------
galel@cs.huji.ac.il
Graphical Models -software tools
Working Group Neural Networks and Fuzzy Systems
Graphical Models
Software Tools
Contents
Overview
BayesBuilder
Bayesian Knowledge Discoverer / Bayesware Discoverer
Bayes Net Toolbox
Belief Network Power Constructor
GeNIe / SMILE
Hugin
Netica
Pulcinella
Tetrad
WinMine / MSBN
Overview
On this page we briefly describe some software tools that support reasoning with graphical models and/or inducing them from a database of sample cases. Of course, we do not claim this list to be complete (definitely it is not). Nor does it represent a ranking of the tools, since they are ordered alphabetically. More extensive lists of probabilistic network tools have been compiled by
Russell Almond (an old list, which is no longer maintained):
http://www.stat.washington.edu/almond/belief.html
Kevin Patrick Murphy:
http://www.cs.berkeley.edu/~murphyk/Bayes/bnsoft.html
and Google:
http://directory.google.com/Top/Computers/Artificial_Intelligence/Belief_Networks/Software/
The Bayesian Network Repository is also a valuable resource. It lists examples of Bayesian networks and datasets, from which they can be learned:
http://www.cs.huji.ac.il/labs/compbio/Repository/
The software we developed in connection with our book is available at:
http://fuzzy.cs.uni-magdeburg.de/books/gm/software.html
Tools for troubleshooting Microsoft products, which are based on Bayesian networks (but do not allow you to access them directly), can be found at
http://support.microsoft.com/support/tshoot/default.asp
BayesBuilder
SNN, University of Nijmegen
PO Box 9101, 6500 HB Nijmegen, The Netherlands
http://www.mbfys.kun.nl/snn/Research/bayesbuilder/
BayesBuilder is a tool for (manually) constructing Bayesian networks and drawing inferences with them. It supports neither parameter nor structure learning of Bayesian networks. The graphical user interface of this program is written in Java and is easy to use. However, the program is available only for Windows, because the underlying inference engine is written in C++ and has so far been compiled only for Windows. BayesBuilder is free software.
Bayesian Knowledge Discoverer / Bayesware Discoverer
Knowledge Media Institute / Department of Statistics
The Open University
Walton Hall, Milton Keynes MK7 6AA, United Kingdom
http://kmi.open.ac.uk/projects/bkd/
Bayesware Ltd.
http://bayesware.com/
The Bayesian Knowledge Discoverer is a software tool that can learn Bayesian networks from data (structure as well as parameters). The dataset to learn from may contain missing values, which are handled by an approach called "bound and collapse" that is based on probability intervals. The Bayesian Knowledge Discoverer is free software, but it has been succeeded by a commercial version, the Bayesware Discoverer. This program has a nice graphical user interface with some powerful visualization options. A 30-day trial version may be retrieved free of charge. Bayesware Discoverer is available for Windows, Unix and Macintosh.
Bayes Net Toolbox
Kevin Patrick Murphy
Department of Computer Science, UC Berkeley
387 Soda Hall, Berkeley, CA 94720-1776, USA
http://www.cs.berkeley.edu/~murphyk/Bayes/bnt.html
The Bayes Net Toolbox is an extension for Matlab, a well-known and widely used mathematical software package. It supports several different algorithms for drawing inferences in Bayesian networks as well as several algorithms for learning the parameters and the structure of Bayesian networks from a dataset of sample cases. It does not have a graphical user interface of its own, but profits from the visualization capabilities of Matlab. The Bayes Net Toolbox is distributed under the Gnu Library General Public License and is available for all systems that can run Matlab, an installation of which is required.
Belief Network Power Constructor
Jie Cheng
Dept. of Computing Science, University of Alberta
155 Athabasca Hall, Edmonton, Alberta, Canada T6G 2E1
http://www.cs.ualberta.ca/~jcheng/bnpc.htm
The Belief Network Power Constructor uses a three-phase algorithm that is based on conditional independence tests to learn the structure of a Bayesian network from data. The conditional independence tests rely on mutual information, which is used to determine whether a node (or a set of nodes) can reduce or even block the information flow from one node to another. The program comes with a graphical user interface, though a much less advanced one than those of, for instance, Hugin and Netica (see below). It does not support drawing inferences, but has the advantage that it is free software. It is available only for Windows.
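The mutual-information idea behind such conditional independence tests can be sketched in Python as follows. This is not the tool's actual three-phase algorithm, only an illustration of the quantities it relies on: the empirical mutual information I(X;Y) and the conditional mutual information I(X;Y|Z), computed from discrete sample cases. The function names and the toy data are assumptions made for this example.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Empirical I(X;Y) in nats from a list of (x, y) sample cases."""
    n = len(pairs)
    joint = Counter(pairs)
    count_x = Counter(x for x, _ in pairs)
    count_y = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log( p(x,y) / (p(x) p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
    return mi


def conditional_mutual_information(triples):
    """Empirical I(X;Y|Z): the p(z)-weighted average of I(X;Y) within each Z value."""
    n = len(triples)
    groups = {}
    for x, y, z in triples:
        groups.setdefault(z, []).append((x, y))
    return sum(len(g) / n * mutual_information(g) for g in groups.values())


# Toy data: X and Y are both copies of Z, so they are dependent,
# but Z blocks all information flow between them.
cases = [(0, 0, 0), (1, 1, 1)] * 2
print(mutual_information([(x, y) for x, y, _ in cases]))
print(conditional_mutual_information(cases))
```

In a structure-learning setting, one would typically compare the conditional value against a small threshold and treat X and Y as conditionally independent given Z when it falls below; the choice of threshold is a tuning decision not specified here.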
GeNIe / SMILE
Decision Systems Laboratory, University of Pittsburgh
B212 SLIS Building, 135 North Bellefield Avenue, Pittsburgh, PA 15260, USA
http://www2.sis.pitt.edu/~genie/
SMILE (Structural Modeling, Inference and Learning Engine) is a library of functions for building Bayesian networks and drawing inferences with them. It supports neither parameter nor structural learning of Bayesian networks. GeNIe (Graphical Network Interface) is a graphical user interface for SMILE that makes the functions of SMILE easily accessible. While SMILE is platform independent, GeNIe is available only for Windows, since it relies heavily on the Microsoft Foundation Classes. Both packages are distributed free of charge.
Hugin
Hugin Expert A/S
Niels Jernes Vej 10, 9220 Aalborg, Denmark
http://www.hugin.com
Hugin is one of the oldest and best-known tools for Bayesian network construction and inference. It comes with an easy-to-use graphical user interface, but also has an API (application programming interface) for several programming languages, so that the inference engine can be used in other programs. It supports estimating the parameters of a Bayesian network from a dataset of sample cases. A recent version also adds a learning algorithm for the structure of a Bayesian network, which is based on conditional independence tests. Hugin is a commercial tool, but a demo version with restricted capabilities may be retrieved free of charge. Hugin is available for Windows and Solaris (Sun Unix).
Netica
Norsys Software Corp.
2315 Dunbar Street, Vancouver, BC, Canada V6R 3N1
http://www.norsys.com
Like Hugin, Netica is a commercial tool with an advanced graphical user interface. It supports Bayesian network construction and inference and also includes an API (application programming interface) for C++, so that the inference engine may be used in other programs. Netica offers quantitative network learning (known structure, parameter estimation) from a dataset of sample cases, which may contain missing values. It does not support structural learning. A version of Netica with restricted capabilities may be retrieved free of charge, and the price of a full version is moderate. Netica is available for Windows and Macintosh.
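Parameter estimation for a known structure amounts, in its simplest form, to counting. The Python sketch below estimates one conditional probability table by relative frequencies and simply skips cases that have a missing value in the relevant variables. That treatment of missing data is an assumption made for illustration and is not a description of how Netica itself handles it. The function estimate_cpt and the variables rain and wet are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_cpt(cases, child, parents):
    """Estimate P(child | parents) by relative frequencies.

    cases is a list of dicts mapping variable names to values; None marks
    a missing value. Returns {parent_values: {child_value: probability}}.
    """
    counts = defaultdict(Counter)
    for case in cases:
        values = [case.get(name) for name in list(parents) + [child]]
        if any(v is None for v in values):
            continue  # skip cases with missing values (simplifying assumption)
        counts[tuple(values[:-1])][values[-1]] += 1
    cpt = {}
    for parent_values, counter in counts.items():
        total = sum(counter.values())
        cpt[parent_values] = {value: n / total for value, n in counter.items()}
    return cpt


# Toy dataset for a two-node network rain -> wet.
cases = [
    {"rain": True, "wet": True},
    {"rain": True, "wet": True},
    {"rain": True, "wet": False},
    {"rain": False, "wet": False},
    {"rain": None, "wet": True},  # missing parent value, ignored here
]
cpt = estimate_cpt(cases, "wet", ["rain"])
print(cpt)
```

Discarding incomplete cases wastes data when many values are missing; that is the situation in which more careful methods for missing values become worthwhile.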
Pulcinella
IRIDIA, Université Libre de Bruxelles
50, Av. F. Roosevelt, CP 194/6, B-1050 Brussels, Belgium
http://iridia.ulb.ac.be/pulcinella/Welcome.html
Pulcinella is more general than the other programs listed on this page, as it is based on the framework of valuation systems [Shenoy 1992a]. Pulcinella supports reasoning by propagating uncertainty with local computations w.r.t. different uncertainty calculi, but does not support learning graphical models from a dataset of sample cases in any way. The current version of Pulcinella does not have a graphical user interface, but an outdated version of such an interface may be retrieved for Solaris (Sun Unix). Pulcinella is available for Solaris (Sun Unix) and Macintosh, but requires a Common Lisp system.
Tetrad
Tetrad Project, Department of Philosophy
Carnegie Mellon University, Pittsburgh, PA, USA
http://hss.cmu.edu/html/departments/philosophy/TETRAD/tetrad.htm
Tetrad is based on the algorithms developed in [Spirtes et al. 1993], i.e. on conditional independence test approaches to learning Bayesian networks from data, and on subsequent research in this direction. It can learn the structure as well as the parameters of a Bayesian network from a dataset of sample cases, but does not support drawing inferences. Currently the program is being ported to Java (Tetrad IV). Older versions are available for MSDOS (Tetrad II) and Windows (Tetrad III). Tetrad II is commercial, but available at a moderate fee. Free beta versions of Tetrad III and Tetrad IV are available.
WinMine / MSBN
Machine Learning and Statistics Group
Microsoft Research, One Microsoft Way, Redmond, WA 98052-6399, USA
http://research.microsoft.com/~dmax/WinMine/tooldoc.htm
WinMine is a toolkit, i.e. a set of programs for different tasks, rather than an integrated program. Most programs in this toolkit are command-line driven, but there is a graphical user interface for the data converter and a network visualization program. WinMine learns the structure and the parameters of Bayesian networks from data and uses decision trees to represent the conditional distributions. It does not support drawing inferences. However, Microsoft Research also offers MSBN (Microsoft Bayesian Networks), a tool for (manually) building Bayesian networks and drawing inferences with them; MSBN comes with a graphical user interface. Both programs, WinMine as well as MSBN, are available for Windows only.
© 2002 Christian Borgelt
Last modified: Fri Oct 25 11:05:52 MEST 2002
Wednesday, January 27, 2010
Robert Burns 罗伯特·彭斯
Robert Burns
Born in Alloway, Scotland, on January 25, 1759, Robert Burns was the first of William and Agnes Burnes' seven children. His father, a tenant farmer, educated his children at home. Burns also attended one year of mathematics schooling and, between 1765 and 1768, he attended an "adventure" school established by his father and John Murdoch. His father died in bankruptcy in 1784, and Burns and his brother Gilbert took over the farm. This hard labor later contributed to the heart trouble that Burns suffered as an adult.
At the age of fifteen, he fell in love and shortly thereafter he wrote his first poem. As a young man, Burns pursued both love and poetry with uncommon zeal. In 1785, he fathered the first of his fourteen children. His biographer, DeLancey Ferguson, said, "it was not so much that he was conspicuously sinful as that he sinned conspicuously." Between 1784 and 1785, Burns also wrote many of the poems collected in his first book, Poems, Chiefly in the Scottish Dialect, which was printed in 1786 and paid for by subscriptions. This collection was an immediate success and Burns was celebrated throughout England and Scotland as a great "peasant-poet."
In 1788, he and his wife, Jean Armour, settled in Ellisland, where Burns was given a commission as an excise officer. He also began to assist James Johnson in collecting folk songs for an anthology entitled The Scots Musical Museum. Burns spent the final twelve years of his life editing and imitating traditional folk songs for this volume and for Select Collection of Original Scottish Airs. These volumes were essential in preserving parts of Scotland's cultural heritage and include such well-known songs as "My Luve is Like a Red Red Rose" and "Auld Lang Syne." Robert Burns died from heart disease at the age of thirty-seven. On the day of his death, Jean Armour gave birth to his last son, Maxwell.
Most of Burns' poems were written in Scots. They document and celebrate traditional Scottish culture, expressions of farm life, and class and religious distinctions. Burns wrote in a variety of forms: epistles to friends, ballads, and songs. His best-known poem is the mock-heroic Tam o' Shanter. He is also well known for the more than three hundred songs he wrote, which celebrate love, friendship, work, and drink with often hilarious and tender sympathy. Even today, he is often referred to as the National Bard of Scotland.
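External link: official Burns site: http://www.robertburns.org/
British poet. Born on January 25, 1759, into a tenant farmer's family in Alloway, Ayrshire, Scotland; died on July 21, 1796, in Dumfries. His family was poor and he had no formal schooling, acquiring wide-ranging knowledge through self-study. His finest poems were written between 1785 and 1790 and are collected in Poems, Chiefly in the Scottish Dialect. The collection broke with the neoclassical style that then dominated English poetry and drew on local life and folk literature, which brought fresh vitality to verse and shaped the basic character of his work. He praised nature and rural life with devout feeling and satirized the hypocrisy of the church and of everyday life in trenchant language. The collection made Burns famous overnight, and he was hailed as a ploughman of genius. He was then invited to Edinburgh, where he moved among the notables of high society, but he found his proud nature and radical ideas at odds with that world and returned home to farm. He travelled for a time in the Highlands of northern Scotland and later became an excise officer, writing while he held the post.
Most of Burns's poems are in the Scots dialect, and most are short lyrics, such as the famous love poem "A Red, Red Rose" and the patriotic "Scots Wha Hae". He also wrote many satires (such as "Holy Willie's Prayer"), verse epistles (such as "Epistle to J. Lapraik"), and narrative poems (such as "The Twa Dogs" and "The Jolly Beggars"). His work voices the thoughts and feelings of the common people and sympathizes with the hardships of the lower classes, while expressing in a healthy, natural way a hedonistic philosophy of "wine, women and song". Burns had a keen sense of humor. His vivid descriptions of Scottish rural life give his poetry its national character and artistic appeal.
Besides writing poetry, Burns collected and edited a great many Scottish folk songs and helped edit and publish the six-volume The Scots Musical Museum and the eight-volume A Select Collection of Original Scottish Airs. Among these songs, "Auld Lang Syne" is famous not only in Scotland but around the world.
In every corner of the world, at partings between friends and family or at the closing ceremonies of meetings, people sing "Auld Lang Syne" (known in Chinese as 《友谊地久天长》 or 《骊歌》) together in many different languages, arms tightly linked, celebrating a friendship never to be forgotten. It drives away the sorrow of parting and sends people on their way full of feeling. The lyricist of this well-known Scottish folk song is the celebrated national poet Robert Burns (1759-1796).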
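Burns was born into a poor farming family that made its living on rented land. As a small child he attended a primary school near his home in Scotland. The schoolmaster soon left, and his father engaged a teacher to teach at home. The teacher judged the Burns brothers to be no worse than their older schoolmates. In the evenings their father taught them grammar and theology. At twelve, Burns and his brother took turns attending school in a village four miles from home, and at fourteen, alongside his English studies, he began to learn French.
At fifteen Burns became his father's principal laborer, driving horses to plough the mounds and hollows. The work was extremely hard. Although the family changed tenancies several times, the soil was poor and the harvests remained bad. In his spare time Burns loved to read the poets Shenstone (1714-1763), Pope (1688-1744), and Fergusson (1750-1774), and he also read the Scottish novelist Mackenzie (1745-1831). He hoped to become the poet of Ayrshire and sing of the hills and rivers of his homeland.
His father died in 1784 and the whole family moved to Mossgiel, where the harvests were no better. Fortunately his collection of poems, written chiefly in the Scottish dialect, was published and was an immediate success, and an Edinburgh publisher soon brought out a second edition. Mackenzie, who edited a literary magazine, praised the ploughman in a review as a poetic genius.
So in Edinburgh Burns put on a dark coat, a light waistcoat, and a frilled shirt, with buckskin shoes or long boots, and led a double life between literary gatherings and taverns.
After living in Edinburgh and touring Scotland for a time, Burns went back to farming. In 1788 he qualified as an excise officer, and besides working his fields he had to ride 200 miles a week on horseback in the course of his duties. In enforcing the law he did not let the big offenders off, but he was lenient with the poor. To devote himself fully to his excise work, he gave up farming in 1791 and moved to Dumfries, where he spent his last years.
Burns wrote a large number of lyric poems, as well as satires and narrative poems. He also loved songs, had a keen musical ear, and responded well to rhythm. He devoted most of the energy of his last ten years to collecting and editing folk songs for two series, preserving more than three hundred folk songs that were on the verge of being lost.
In 1796 he suffered from rheumatic arthritis and heart disease, and he died young on July 21 of that year. As many as twenty thousand people attended his funeral.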
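The cottage where Burns was born and spent the first seven years of his childhood stands in Alloway, Ayrshire, and is now managed by the Burns Monument Trust. The adjoining museum, with its red-tiled roof and a veranda and garden in front, was added by the trustees in 1920. In 1994 the trustees rethatched the roof and rebuilt the eighteenth-century kitchen garden and stone dyke. This is the now world-famous Burns Cottage.
A statue of Burns stands in the center of Ayr. Take the bus from near the town center toward the Auld Brig o' Doon and get off one stop before the terminus at the old bridge; you will see a white single-storey house with a thatched roof and dark brown wooden doors and windows. The black plaque above the door reads "Burns Cottage", followed by "Robert Burns, the Ayrshire Poet" and the dates of his birth and death. Inside, you come first to the barn and then to the byre and stable. The sounds of livestock can be faintly heard, and a few hens peck at grain beside a wax ox. In the living room, wax figures recreate a harmonious family scene. The father reads the Bible by candlelight, the mother sits opposite holding a younger sister, a younger brother sits to one side, and Burns stands barefoot nearby, listening intently. A baby sister lies in the cradle. In the kitchen the smoke-blackened hearth is still lit, and the bed in which Burns was born is in the kitchen too. Everything is arranged as it was then. Here, when he was small, his mother taught the children Scottish folk songs, and an aunt introduced them to a great many stories and songs about ghosts and fairies. For admirers of Burns, Burns Cottage is a shrine, but it is also a picture of the home life of poor farming families of the time.
Burns lived in the countryside for many years, doing heavy farm work. Exploitation by landlords, together with poor soil, failed harvests, debt, and repeated moves, often left him without enough food or warmth. But he loved life and felt deeply for working people. In "The Twa Dogs", a dialogue between two dogs, one from a rich household and one from a poor one, depicts the extravagance and dissipation of the landlord's family. The poor tenants, though their farm work is hard, gather to share their pleasures. That the dogs of the two households get along so well stands in sharp contrast to the inequality of human life.
The poet also understood a farmer's deep attachment to his animals. In "The Auld Farmer's New-Year Morning Salutation to His Auld Mare, Maggie", after looking back over the old mare's lifetime of toil, he promises to tether her on a patch of cornland he has kept back, where she can eat her fill in comfort without much effort.
The Burns Museum, connected to the cottage, holds precious Burns manuscripts, his works including early editions, related portraits, and more; some items came from the United States, Canada, and even South Africa.
The large exhibition room presents his life of labor, writing, and daily living. Pictures paired with his verses, letters, or diary entries vividly recount his experiences. Also on display are his pocket watch, notebook, inkwell, and snuffbox, two pistols with "R.B." engraved on the grips, and the long rod he used for gauging spirits as an excise officer, along with the 1786 edition of his poems chiefly in the Scottish dialect. I naturally remembered to look for the manuscript of "Auld Lang Syne"; it turns out to come from a letter Burns wrote to a friend in 1788.
The second room shows several well-known oil paintings. "The Haggis Feast" depicts Burns and his wife entertaining guests. Burns enjoyed such lively occasions. There is also a set of four prints illustrating his "Tam o' Shanter". This long poem, which Burns based on a folk legend, tells of Tam meeting spirits on his way home late at night. He stripped out the superstitious elements of the legend and treated witchcraft as comedy, with a deeper meaning underneath. The poem also ties the legends he heard as a child to the landmarks he knew in his native Alloway, the old church, the ancient stone bridge, and the cairn, which lends it a sense of age and mystery.
A walk of about one li from here brings you to the old bridge over the Doon; the Burns Monument stands on a nearby hill. This Greek-style structure was designed by a noted Edinburgh architect and completed in 1823 at a cost of £3,247. From the platform of the monument you can look out over the fine scenery of the River Doon and Carrick Hill. The base of the monument houses an exhibition room displaying Burns's works in fifteen foreign languages. In the nearby garden there is also a statue house containing three life-size, humorous figures of characters from "Tam o' Shanter".
On the way back, at the bus stop by the old bridge over the Doon, there is a large department store where many goods are named after Burns. If the timing works out (the bus back to Ayr runs once an hour), you can also watch a documentary film about Burns.
Kilmarnock and Dumfries, where he lived, also have Burns museums, statues, or monuments, and Irvine has a statue of him too. There are monuments to him as far away as Canada and Australia. The Writers' Museum in Edinburgh, the Scottish capital, which presents the lives of Burns, Walter Scott (1771-1832), and Robert L. Stevenson (1850-1894), is also worth a visit.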
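In Burns's native Scotland there are several thousand Burns clubs, and his birthday is celebrated every year throughout Scotland.
Few poets anywhere are so widely loved by the people of their homeland. Besides living for many years in the countryside and writing poems that portray his native land and its plain, honest people, he felt national oppression himself, loved Scotland dearly, and sang passionately of democracy and freedom.
During Burns's youth, the American War of Independence and the French Revolution broke out in succession. He cared about world politics and the fate of his native Scotland. His "Ode for General Washington's Birthday" praises the American people's struggle for independence. Under the influence of the French Revolution he wrote two famous poems, "The Tree of Liberty" and "Scots Wha Hae". "The Tree of Liberty" declares that with France's tree of liberty, mankind will become equal and the world will have peace. "Scots Wha Hae" revisits history, rousing the people by celebrating the deeds of early national heroes such as Wallace:
Who, for Scotland's king and law,
will draw the sword of freedom with all his might?
To live a free man, to die a free soul,
let him go forward with me!
Burns is a poet of the people and a fighter for freedom. This bright star shines forever over Scotland, and forever in the hearts of all who love peace and friendship.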
【作品选译】
一朵红红的玫瑰
啊,我的爱人象朵红红的玫瑰,
六月里迎风初开,
啊,我的爱人象支甜甜的曲子,
奏得合拍又和谐。
我的好姑娘,多么美丽的人儿!
请看我,多么深挚的爱情!
亲爱的,我永远爱你,
纵使大海干涸水流尽。
纵使大海干涸水流尽,
太阳将岩石烧作灰尘,
亲爱的,我永远爱你,
只要我一息犹存。
珍重吧,我唯一的爱人,
珍重吧,让我们暂时别离,
但我定要回来,
哪怕千里万里!
王佐良译
往昔的时光
老朋友哪能遗忘,
哪能不放在心上?
老朋友哪能遗忘,
还有往昔的时光?
为了往昔的时光,老朋友,
为了往昔的时光,
再干一杯友情的酒,
为了往昔的时光,
你来痛饮一大杯,
我也买酒来相陪。
干一杯友情的酒又何妨?
为了往昔的时光。
我们曾遨游山岗,
到处将野花拜访。
但以后走上疲惫的旅程,
逝去了往昔的时光!
我们曾赤脚蹚过河流,
水声笑语里将时间忘。
如今大海的怒涛把我们隔开,
逝去了往昔的时光!
忠实的老友,伸出你的手,
让我们握手聚一堂,
再来痛饮一杯欢乐酒,
为了往昔的时光!
王佐良译
给我开门,哦!
曲调:轻轻地开门
哦,开门,纵使你对我无情,
也表一点怜悯,哦。
你虽变了心,我仍忠于情,
哦,给我开门,哦。
风吹我苍白的双颊,好冷!
但冷不过你对我的心,哦。
冰霜使我心血凝冻,
也没你给我的痛深,哦。
残月沉落白水中,
时间也随我沉落,哦。
假朋友,变心人,永别不再逢!
我决不再来烦渎,哦。
她把门儿大敞开,
见了平地上苍白的尸体,哦,
只喊了一声“爱”就倒在尘埃,
从此再也不起,哦。
王佐良译
走过麦田来
(合唱)啊,珍尼是可怜的人儿,
珍尼哭得悲哀。
她拖着长裙,
走过麦田来。
可怜的人儿,走过麦田来,
走过麦田来,
她拖着长裙
走过麦田来。
如果一个他碰见一个她,
走过麦田来,
如果一个他吻了一个她,
她何必哭起来?
如果一个他碰见一个她
走过山间小道,
如果一个他吻了一个她,
别人哪用知道!
(合唱)啊,珍尼是可怜的人儿,
珍尼哭得悲哀。
她拖着长裙,
走过麦田来。
王佐良译
如果你站在冷风里
呵,如果你站在冷风里,
一人在草地,在草地,
我的小屋会挡住凶恶的风,
保护你,保护你。
如果灾难象风暴袭来,
落在你头上,你头上,
我将用胸脯温暖你,
一切同享,一切同当。
如果我站在可怕的荒野,
天黑又把路迷,把路迷,
就是沙漠也变成天堂,
只要有你,只要有你。
如果我是地球的君王,
宝座我们共有,我们共有,
我的王冠上有一粒最亮的珍珠——
它是我的王后,我的王后。
王佐良译
选自《彭斯诗选》,人民文学出版社(1959)
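Scots (Scots Wha Hae) ①
Scots who have bled with Wallace,
Scots who have fought under Bruce,
Rise! Though it be to fall in your blood,
Or else to victory!
The hour has come, the decisive battle nears,
The front is hard pressed,
Proud Edward leads his army in,
Bringing chains and slavery!
Who would sell his country for gain?
Who would crawl into a coward's grave?
Who is so base as to live on as a slave?
Let him go, let him flee!
Who will defend Scotland's king and law,
Draw freedom's sword and strike with might?
Who will live a free man, or die a free soul?
Let him come and charge with me!
By the sufferings of the oppressed,
By your sons in servile chains,
We swear to bleed to the last,
But they shall be free!
Down with the proud usurpers!
A tyrant falls with every foe!
Every blow adds to our freedom!
Strike, or lose your heads!
Chinese translation by Yuan Kejia
① This is the most famous of Burns's patriotic poems. It presents the address of Robert Bruce, King of Scots, to his troops before the battle of Bannockburn (1314), in which he routed the invading English army. It was first published in the Morning Chronicle in June 1794.
Wallace, mentioned in the poem, was a thirteenth-century Scottish national hero who also inflicted heavy defeats on the English, but was later betrayed, captured and executed. Edward refers to King Edward II of England.
Burns never forgot those who had fought for Scottish national independence, and his patriotic feeling ran especially high when he wrote this poem. He was also using the past to comment on the present: he told a friend plainly in a letter that the poem was inspired not only by that ancient "glorious struggle for freedom" but also by "struggles of the same nature that are not so distant in time", that is, the French Revolution, which was then in full swing across the water from Scotland.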
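My Heart's in the Highlands ①
My heart's in the Highlands, my heart is not here,
My heart's in the Highlands, chasing the deer,
Chasing the wild deer and following the roe;
My heart's in the Highlands, wherever I go.
Farewell to the Highlands, farewell to the North,
Home of heroes, honoured homeland;
Wherever I wander, wherever I rove,
I shall always love the Highland hills.
Farewell, high snow-covered mountains,
Farewell, the glens and green valleys below,
Farewell, the forests and woods of tangled boughs,
Farewell, the roar of torrents and floods.
My heart's in the Highlands, my heart is not here,
My heart's in the Highlands, chasing the deer,
Chasing the wild deer and following the roe;
My heart's in the Highlands, wherever I go.
Chinese translation by Yuan Kejia
① The northern region of Scotland.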
Robert Burns
Born in Alloway, Scotland, on January 25, 1759, Robert Burns was the first of William and Agnes Burnes' seven children. His father, a tenant farmer, educated his children at home. Burns also attended one year of mathematics schooling and, between 1765 and 1768, he attended an "adventure" school established by his father and John Murdoch. His father died in bankruptcy in 1784, and Burns and his brother Gilbert took over the farm. This hard labor later contributed to the heart trouble that Burns suffered as an adult.
At the age of fifteen, he fell in love and shortly thereafter he wrote his first poem. As a young man, Burns pursued both love and poetry with uncommon zeal. In 1785, he fathered the first of his fourteen children. His biographer, DeLancey Ferguson, had said, "it was not so much that he was conspicuously sinful as that he sinned conspicuously." Between 1784 and 1785, Burns also wrote many of the poems collected in his first book, Poems, Chiefly in the Scottish Dialect, which was printed in 1786 and paid for by subscriptions. This collection was an immediate success and Burns was celebrated throughout England and Scotland as a great "peasant-poet."
In 1788, he and his wife, Jean Armour, settled in Ellisland, where Burns was given a commission as an excise officer. He also began to assist James Johnson in collecting folk songs for an anthology entitled The Scots Musical Museum. Burns spent the final twelve years of his life editing and imitating traditional folk songs for this volume and for Select Collection of Original Scottish Airs. These volumes were essential in preserving parts of Scotland's cultural heritage and include such well-known songs as "My Luve is Like a Red Red Rose" and "Auld Lang Syne." Robert Burns died from heart disease at the age of thirty-seven. On the day of his death, Jean Armour gave birth to his last son, Maxwell.
Most of Burns' poems were written in Scots. They document and celebrate traditional Scottish culture, expressions of farm life, and class and religious distinctions. Burns wrote in a variety of forms: epistles to friends, ballads, and songs. His best-known poem is the mock-heroic Tam o' Shanter. He is also well known for the over three hundred songs he wrote which celebrate love, friendship, work, and drink with often hilarious and tender sympathy. Even today, he is often referred to as the National Bard of Scotland.
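External link: official Burns website: http://www.robertburns.org/
British poet. Born on 25 January 1759 into a tenant farmer's family in Alloway, Ayrshire, Scotland; died on 21 July 1796 in Dumfries. His family was poor from his childhood; he had no regular schooling and acquired his wide knowledge through self-study. His finest poems were written between 1785 and 1790 and collected in Poems, Chiefly in the Scottish Dialect. The collection shows the poet breaking with the neoclassical style then dominant in English poetry and drawing on local life and folk literature, which brought fresh vitality to poetry and shaped the basic character of his work. He celebrates nature and rural life with devout feeling, and satirizes the hypocrisy of the church and of everyday life in sharp, penetrating language. The collection made Burns famous overnight, and he was hailed as a ploughman of genius. He was then invited to Edinburgh, where he moved among the notables of high society, but he found his proud nature and radical ideas at odds with that world and returned home to farm. He travelled for a time in the Highlands of northern Scotland and later became an excise officer, writing while he held the post.
Most of Burns's poems are in Scots dialect and most are short lyrics, such as the famous love poem "A Red, Red Rose" and the patriotic "Scots Wha Hae". He also wrote many satires (such as "Holy Willie's Prayer"), verse epistles (such as "Epistle to J. Lapraik") and narrative poems (such as "The Twa Dogs" and "The Jolly Beggars"). His work expresses the thoughts and feelings of the common people and sympathizes with the hardships of the lower classes, while conveying in a healthy, natural way a hedonistic philosophy of "wine, women and song". Burns had a keen sense of humour. His vivid descriptions of Scottish rural life give his poetry its national character and artistic appeal.
Besides writing poetry, Burns collected and edited a large number of Scottish folk songs and helped to edit and publish the 6-volume The Scots Musical Museum and the 8-volume Select Collection of Original Scottish Airs. Among these songs, "Auld Lang Syne" is celebrated not only in Scotland but throughout the world.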
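In every corner of the globe, when friends and relatives part or a meeting comes to a close, people sing "Auld Lang Syne" (also known as "The Parting Song") together in many different languages, linking arms and singing of friendship that will never be forgotten. It drives away the sorrow of parting and sends people on their separate ways full of feeling. The words of this universally known Scottish folk song were written by the famous national poet Robert Burns (1759-1796).
Burns was born into a poor farming family that lived by working rented land. As a child he attended a primary school near his home in Scotland. The headmaster soon left, and his father engaged a teacher to teach at home. The teacher found the Burns brothers no less able than older pupils. In the evenings their father taught them grammar and theology. At 12, the two Burns brothers took turns attending school in a village four miles from home, and at 14, alongside his English studies, Burns began to learn French.
At 15 Burns became his father's main farm hand, driving horses to work the mounds and hollows. The labour was extremely hard. Although the family changed tenancies several times, the land was poor and the harvests remained bad. In his spare time Burns loved to read the poets Shenstone (1714-1763), Pope (1688-1744) and Fergusson (1750-1774), and he also read the Scottish novelist Mackenzie (1745-1831). He hoped to become the poet of Ayrshire and to sing of the hills and rivers of his homeland.
His father died in 1784 and the whole family moved to Mossgiel, where the harvests were no better. Fortunately his collection of poems, written chiefly in the Scottish dialect, was published and was an immediate success, and an Edinburgh publisher soon brought out a second edition. Mackenzie, who edited a literary magazine, praised this ploughman in a review as a poetic genius.
So in Edinburgh Burns put on a dark coat, a light waistcoat and a frilled shirt, with buckskin shoes or long boots, and led a double life between literary gatherings and taverns.
After living in Edinburgh and touring Scotland for a time, Burns went back to farming. In 1788 he qualified as an excise officer; besides working in the fields he had to ride 200 miles a week on horseback for the job. In enforcing the law he did not let the big fish get away, but he was lenient with the poor. To devote himself fully to his excise work, in 1791 he gave up farming and moved to Dumfries, where he spent his last years.
Burns wrote a great many lyrics as well as satires and narrative poems. He also loved songs, had a keen musical ear and responded well to rhythm. He devoted most of the energy of his last ten years to collecting and editing folk songs for two serial collections, preserving more than three hundred songs that were on the verge of being lost.
In 1796 he suffered from rheumatic arthritis and heart disease, and he died young on 21 July of that year. As many as twenty thousand people came to his funeral.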
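The cottage where Burns was born and spent the first seven years of his childhood stands in Alloway, Ayrshire, and is now managed by the Burns Monument Trust. The adjoining museum, with its red tiled roof and a veranda and garden in front, was added by the trustees in 1920. In 1994 the trustees re-thatched the roof and recreated the 18th-century kitchen garden and stone dyke. This is the world-famous Burns Cottage.
A statue of Burns stands in the centre of Ayr. Take the bus from near the town centre bound for the Auld Brig o' Doon and get off one stop before the terminus at the old bridge, and you will see a white thatched single-storey house with dark brown wooden doors and windows. A black plaque above the door reads "Burns Cottage", followed by "Robert Burns, the Ayrshire Poet" and the dates of his birth and death. Inside the door comes first the barn, then the byre and the stable. The sounds of animals can faintly be heard, and beside the wax cattle a few hens peck at grain. In the living room, wax figures recreate the family's harmonious life. The father reads the Bible by candlelight, the mother sits opposite holding a younger sister, a younger brother sits to one side, and Burns stands barefoot nearby, listening intently. A baby sister lies in the cradle. In the kitchen a fire still burns in the smoke-blackened hearth, and the bed in which Burns was born is in the kitchen too. Everything is arranged as it was then. Here, in his early years, his mother taught the children Scottish folk songs and an aunt told them a great many tales and songs of ghosts and fairies. For admirers of Burns the cottage is a shrine, but it is also a portrait of the home life of a poor farming family of the time.
Burns lived in the countryside for many years and did heavy farm work. The landlords' exploitation, together with poor soil, failed harvests, debt and repeated moves, often left him without enough food or warmth. But he loved life and felt deeply for working people. In "The Twa Dogs", through a conversation between two dogs belonging to a rich and a poor household, he depicts the extravagance and dissipation of the landlord's family. The poor tenants work hard in the fields yet gather together to share their joys. That the dogs of the two households get on so well stands in sharp contrast to the unfairness of human life.
The poet also understood the farmer's deep feeling for his animals. In "The Auld Farmer's New-Year-Morning Salutation to His Auld Mare, Maggie", after looking back on the old mare's lifetime of toil, he writes: "On the field of stubble I have kept I will tether you, and there, with little effort, you can eat your fill at ease."
The Burns Museum, which adjoins Burns Cottage, holds precious manuscripts by Burns, his works including early editions, and related portraits. Some of the items come from the United States, Canada and even South Africa.
The large exhibition room presents his life of labour, writing and daily living. Pictures, set alongside his verses, letters and diary entries, vividly recount his experiences. Also on display are his pocket watch, notebook, inkwell and snuffbox, two pistols with "R.B." engraved on the handles, and the long rod he used for gauging spirits as an excise officer, as well as the 1786 edition of his poems written chiefly in the Scottish dialect. The author naturally remembered to look for the original manuscript of "Auld Lang Syne"; it turns out to come from a letter Burns wrote to a friend in 1788.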
Tuesday, January 19, 2010
Bonferroni correction in SPSS
ANOVA with SPSS
Never, ever, run any statistical test without performing EDA first!
What's wrong with t-tests?
Nothing, except ...
If you want to compare three or more groups using t-tests with the usual 0.05 level of significance, you would have to compare the three groups pairwise (A to B, A to C, B to C), so the chance of getting at least one false positive (a Type I error) would be:
1 - (0.95 x 0.95 x 0.95) = 14.3%
If you wanted to compare four groups, you would need six pairwise tests and the chance of getting the wrong result would be 1 - (0.95)^6 = 26%, and for five groups (ten tests), 40%. Not good, is it? So we use ANOVA. Never perform multiple t-tests: Anyone on this module discovered performing multiple t-tests when they should use ANOVA will be shot!
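The arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes, as the calculation above does, that the pairwise tests are independent, and the function name `familywise_error` is made up for illustration.

```python
from math import comb

def familywise_error(groups, alpha=0.05):
    """Return (number of pairwise tests, chance of at least one false positive).

    Assumes the pairwise tests are independent, as in the calculation above.
    """
    tests = comb(groups, 2)                 # number of pairwise comparisons
    return tests, 1 - (1 - alpha) ** tests  # 1 - P(no false positive in any test)

for g in (3, 4, 5):
    tests, fwer = familywise_error(g)
    print(f"{g} groups: {tests} pairwise tests, error rate = {fwer:.1%}")
```

The error rate climbs quickly because the number of pairwise tests grows as n(n-1)/2 with the number of groups.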
ANalysis Of VAriance (ANOVA) is such an important statistical method that it would be easy to spend a whole module on this test alone. Like the t-test, ANOVA is a parametric test which assumes:
•data is numerical data representing samples from normally distributed populations
•the variances of the groups are "similar"
•the sizes of the groups are "similar"
•the groups should be independent
so it's important to carry out EDA before starting ANOVA! In fact, ANOVA is quite a robust procedure, so as long as the groups are similar, the test is normally reliable.
ANOVA tests the null hypothesis that the means of all the groups being compared are equal, and produces a statistic called F which plays the same role as the t-statistic from a t-test. But there's a catch. If the means of all the groups tested by ANOVA are equal, fine. But if the result tells us to reject the null hypothesis, we still don't know which of the means differ. We solve this problem by performing what is known as a "post hoc" (after the event) test.
Reminder:
•Independent variable: Variables which are experimentally manipulated by an investigator are called independent variables.
•Dependent variable: Variables which are measured are called dependent variables (because they are presumed to depend on the value of the independent variable).
ANOVA jargon:
•Way = an independent variable, so a one-way ANOVA has one independent variable, two-way ANOVA has two independent variables, etc. Simple ANOVA tests the hypothesis that means from two or more samples are equal (drawn from populations with the same mean). Student's t-test is actually a particular application of one-way ANOVA (two groups compared).
•Factor = a test or measurement. Single-factor ANOVA tests whether the means of the groups being compared are equal and returns a yes/no answer, two-factor ANOVA simultaneously tests two or more factors, e.g. tumour size after treatment with different drugs and/or radiotherapy (drug treatment is one factor and radiotherapy is another). So, "factor" and "way" are alternative terms for the same thing (independent variables).
•Repeated measures: Used when members of a sample are measured under different conditions. As the sample is exposed to each condition, the measurement of the dependent variable is repeated. Using standard ANOVA is not appropriate because it fails to take into account correlation between the repeated measures, violating the assumption of independence. This approach can be used for several reasons, e.g. where research requires repeated measures, such as longitudinal research which measures each sample member at each of several ages - age is a repeated factor. This is comparable to a paired t-test.
The array of options for different ANOVA tests in SPSS is confusing, so I'll go through the most important bits using some examples.
One-Way / Single-Factor ANOVA:
Data:
Pain Scores for Analgesics
Drug: Pain Score:
Diclofenac 0, 35, 31, 29, 20, 7, 43, 16
Ibuprofen 30, 40, 27, 25, 39, 15, 30, 45
Paracetamol 16, 33, 25, 32, 21, 54, 57, 19
Aspirin 55, 58, 56, 57, 56, 53, 59, 55
Since it would be unethical to withhold pain relief, there is no control group and we are just interested in knowing whether one drug performs better (lower pain score) than another, so we need to perform a one-way/single-factor ANOVA.
We enter this data into SPSS using dummy values (1, 2, 3, 4) for the drugs so this numeric data can be used in the ANOVA:
It's always a good idea to enter descriptive labels for data into the Variable View window, or the output is difficult to interpret!
EDA (Analyze: Descriptive Statistics: Explore) shows that the data is normally distributed, so we can proceed with the ANOVA:
Analyze: Compare Means: One-Way ANOVA
Dependent variable: Pain Score
Factor: Drug:
•SPSS allows many different post hoc tests. Click Post Hoc and select the Tukey and Games-Howell tests.
◦The Tukey test is powerful and widely accepted, but is parametric in that it assumes that the population variances are equal. It also assumes that the sample sizes are equal. If this is not the case, you should use Gabriel's procedure, or if the sizes are very different, use Hochberg's GT2.
◦Games-Howell does not assume population variances are equal or that sample sizes are equal, so is a good alternative if this turns out to be the case.
•Click Options and select Homogeneity of Variance Test, Brown-Forsythe and Welch. The homogeneity of variance test is important since this is an assumption of ANOVA, but if this assumption turns out to be broken, the Brown-Forsythe and Welch options will display alternative versions of the F statistic which means you may still be able to use the result.
•Click OK to run the tests.
Output:
Test of Homogeneity of Variances: Pain
Levene Statistic df1 df2 Sig.
4.837 3 28 .008
The significance value for homogeneity of variances is <.05, so the variances of the groups are significantly different. Since this is an assumption of ANOVA, we need to be very careful in interpreting the outcome of this test:
ANOVA: Pain
Sum of Squares df Mean Square F Sig.
Between Groups 4956.375 3 1652.125 11.967 .000
Within Groups 3865.500 28 138.054
Total 8821.875 31
This is the main ANOVA result. The significance value comparing the groups (drugs) is <.05, so we could reject the null hypothesis (there is no difference in the mean pain scores with the four drugs). However, since the variances are significantly different, this might be the wrong answer. Fortunately, the Welch and Brown-Forsythe statistics can still be used in these circumstances:
Robust Tests of Equality of Means: Pain
Statistic df1 df2 Sig.
Welch 32.064 3 12.171 .000
Brown-Forsythe 11.967 3 18.889 .000
The significance value of these are both <.05, so we still reject the null hypothesis. However, this result does not tell us which drugs are responsible for the difference, so we need the post hoc test results:
Multiple Comparisons
Dependent Variable: Pain
(I) Drug (J) Drug Mean Difference (I-J) Std. Error Sig. 95% Confidence Interval
Lower Bound Upper Bound
Tukey HSD 1 2 -8.750 5.875 .457 -24.79 7.29
3 -9.500 5.875 .386 -25.54 6.54
4 -33.500(*) 5.875 .000 -49.54 -17.46
2 1 8.750 5.875 .457 -7.29 24.79
3 -.750 5.875 .999 -16.79 15.29
4 -24.750(*) 5.875 .001 -40.79 -8.71
3 1 9.500 5.875 .386 -6.54 25.54
2 .750 5.875 .999 -15.29 16.79
4 -24.000(*) 5.875 .002 -40.04 -7.96
4 1 33.500(*) 5.875 .000 17.46 49.54
2 24.750(*) 5.875 .001 8.71 40.79
3 24.000(*) 5.875 .002 7.96 40.04
Games-Howell 1 2 -8.750 6.176 .513 -27.05 9.55
3 -9.500 7.548 .602 -31.45 12.45
4 -33.500(*) 5.194 .001 -50.55 -16.45
2 1 8.750 6.176 .513 -9.55 27.05
3 -.750 6.485 .999 -20.09 18.59
4 -24.750(*) 3.471 .001 -36.03 -13.47
3 1 9.500 7.548 .602 -12.45 31.45
2 .750 6.485 .999 -18.59 20.09
4 -24.000(*) 5.558 .014 -42.26 -5.74
4 1 33.500(*) 5.194 .001 16.45 50.55
2 24.750(*) 3.471 .001 13.47 36.03
3 24.000(*) 5.558 .014 5.74 42.26
* The mean difference is significant at the .05 level.
The Tukey test relies on homogeneity of variance, so we ignore these results. The Games-Howell post-hoc test does not rely on homogeneity of variance (this is why we used two different post-hoc tests) and so can be used. SPSS kindly flags (*) which differences are significant!
Result: Drug 4 (Aspirin) produces a significantly different result from the other three drugs.
Formal Reporting: When we report the outcome of an ANOVA, we cite the value of the F ratio and give the number of degrees of freedom, outcome (in a neutral fashion) and significance value. So in this case:
There is a significant difference between the pain scores for aspirin and the other three drugs tested, F(3,28) = 11.97, p < .05.
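The F ratio in the main ANOVA table above can be recomputed by hand from the pain scores. The following is a minimal sketch in plain Python, not what SPSS does internally; the function name `one_way_anova` is an assumption for illustration. It partitions the total variation into between-group and within-group sums of squares and divides the two mean squares.

```python
# Pain scores copied from the data table above.
pain = {
    "Diclofenac":  [0, 35, 31, 29, 20, 7, 43, 16],
    "Ibuprofen":   [30, 40, 27, 25, 39, 15, 30, 45],
    "Paracetamol": [16, 33, 25, 32, 21, 54, 57, 19],
    "Aspirin":     [55, 58, 56, 57, 56, 53, 59, 55],
}

def one_way_anova(groups):
    """Return (SS between, SS within, df between, df within, F)."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    # Between groups: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within groups: spread of each score around its own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio

ss_b, ss_w, df_b, df_w, f = one_way_anova(list(pain.values()))
print(f"Between Groups: SS = {ss_b:.3f}, df = {df_b}")
print(f"Within Groups:  SS = {ss_w:.3f}, df = {df_w}")
print(f"F = {f:.3f}")
```

The sums of squares and F should agree with the Between Groups and Within Groups rows of the SPSS table. The significance value is not computed here because it needs the F distribution, which is not in the Python standard library.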
Two-Factor ANOVA
Do anti-cancer drugs have different effects in males and females?
Data:
Drug: cisplatin vinblastine 5-fluorouracil
Gender:
Female Male Female Male Female Male
Tumour
Size: 65 50 70 45 55 35
70 55 65 60 65 40
60 80 60 85 70 35
60 65 70 65 55 55
60 70 65 70 55 35
55 75 60 70 60 40
60 75 60 80 50 45
50 65 50 60 50 40
We enter this data into SPSS using dummy values for the drugs (1, 2, 3) and genders (1,2) so the coded data can be used in the ANOVA:
It's always a good idea to enter descriptive labels for data into the Variable View window, or the output is difficult to interpret!
EDA (Analyze: Descriptive Statistics: Explore) shows that the data is normally distributed, so we can proceed with the ANOVA:
Analyze: General Linear Model: Univariate
Dependent variable: Tumour Diameter
Fixed Factors: Gender, Drug:
Also select:
Post Hoc: Tukey and Games-Howell:
Options:
Display Means for: Gender, Drug, Gender*Drug
Descriptive Statistics
Homogeneity tests:
Output:
Levene's Test of Equality of Error Variances(a)
Dependent Variable: Diameter
F df1 df2 Sig.
1.462 5 42 .223
Tests the null hypothesis that the error variance of the dependent variable is equal across groups.
a Design: Intercept+Gender+Drug+Gender * Drug
The significance result for homogeneity of variance is >.05, which shows that the error variance of the dependent variable is equal across the groups, i.e. the assumption of the ANOVA test has been met.
Tests of Between-Subjects Effects
Dependent Variable: Diameter
Source Type III Sum of Squares df Mean Square F Sig.
Corrected Model 3817.188(a) 5 763.438 10.459 .000
Intercept 167442.188 1 167442.188 2294.009 .000
Gender 42.188 1 42.188 .578 .451
Drug 2412.500 2 1206.250 16.526 .000
Gender * Drug 1362.500 2 681.250 9.333 .000
Error 3065.625 42 72.991
Total 174325.000 48
Corrected Total 6882.813 47
a R Squared = .555 (Adjusted R Squared = .502)
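The Drug and Gender * Drug effects are significant (<.05), but there is no main effect of gender (p = 0.451). The significant interaction means that the effect of the drugs differs between males and females. Again, this does not tell us which drugs behave differently, so again we need to look at the post hoc tests: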
Multiple Comparisons
Dependent Variable: Diameter
(I) Drug (J) Drug Mean Difference (I-J) Std. Error Sig. 95% Confidence Interval
Lower Bound Upper Bound
Tukey HSD cisplatin vinblastine -1.25 3.021 .910 -8.59 6.09
5-fluorouracil 14.38(*) 3.021 .000 7.04 21.71
vinblastine cisplatin 1.25 3.021 .910 -6.09 8.59
5-fluorouracil 15.63(*) 3.021 .000 8.29 22.96
5-fluorouracil cisplatin -14.38(*) 3.021 .000 -21.71 -7.04
vinblastine -15.63(*) 3.021 .000 -22.96 -8.29
Games-Howell cisplatin vinblastine -1.25 3.329 .925 -9.46 6.96
5-fluorouracil 14.38(*) 3.534 .001 5.64 23.11
vinblastine cisplatin 1.25 3.329 .925 -6.96 9.46
5-fluorouracil 15.63(*) 3.699 .001 6.50 24.75
5-fluorouracil cisplatin -14.38(*) 3.534 .001 -23.11 -5.64
vinblastine -15.63(*) 3.699 .001 -24.75 -6.50
Based on observed means.
* The mean difference is significant at the .05 level.
In this example, we can use the Tukey or Games-Howell results. Again, SPSS helpfully flags which results have reached statistical significance. We already know from the main ANOVA table that the effect of gender is not significant, but the post hoc tests show which drugs produce significantly different outcomes.
Formal Reporting: When we report the outcome of an ANOVA, we cite the value of the F ratio and give the number of degrees of freedom, outcome (in a neutral fashion) and significance value. So in this case:
There is a significant main effect of drug on tumour diameter, F(2, 42) = 16.53, p < .001, with 5-fluorouracil differing from the other two drugs tested.
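The sums of squares in the Tests of Between-Subjects Effects table can be reproduced from the raw tumour sizes. The sketch below assumes a balanced design (the same number of observations in every cell, as here), in which case the simple marginal-mean formulas give the same values as SPSS's Type III sums of squares; the names `tumour` and `two_way_anova` are made up for illustration.

```python
# tumour[drug][gender] -> the eight tumour sizes from the data table above.
tumour = {
    "cisplatin":      {"F": [65, 70, 60, 60, 60, 55, 60, 50],
                       "M": [50, 55, 80, 65, 70, 75, 75, 65]},
    "vinblastine":    {"F": [70, 65, 60, 70, 65, 60, 60, 50],
                       "M": [45, 60, 85, 65, 70, 70, 80, 60]},
    "5-fluorouracil": {"F": [55, 65, 70, 55, 55, 60, 50, 50],
                       "M": [35, 40, 35, 55, 35, 40, 45, 40]},
}

def mean(xs):
    return sum(xs) / len(xs)

def two_way_anova(data):
    """Balanced two-factor ANOVA. Returns {source: (SS, df, F or None)}."""
    drugs = list(data)
    genders = list(data[drugs[0]])
    n = len(data[drugs[0]][genders[0]])          # observations per cell
    scores = [x for d in drugs for g in genders for x in data[d][g]]
    grand = mean(scores)
    # Main effects: marginal means against the grand mean.
    ss_drug = sum(len(genders) * n *
                  (mean([x for g in genders for x in data[d][g]]) - grand) ** 2
                  for d in drugs)
    ss_gender = sum(len(drugs) * n *
                    (mean([x for d in drugs for x in data[d][g]]) - grand) ** 2
                    for g in genders)
    # Interaction: what the cell means explain beyond the two main effects.
    ss_cells = sum(n * (mean(data[d][g]) - grand) ** 2
                   for d in drugs for g in genders)
    ss_inter = ss_cells - ss_drug - ss_gender
    # Error: spread of scores around their own cell mean.
    ss_error = sum((x - mean(data[d][g])) ** 2
                   for d in drugs for g in genders for x in data[d][g])
    df_drug, df_gender = len(drugs) - 1, len(genders) - 1
    df_inter = df_drug * df_gender
    df_error = len(scores) - len(drugs) * len(genders)
    ms_error = ss_error / df_error
    return {
        "Gender":        (ss_gender, df_gender, ss_gender / df_gender / ms_error),
        "Drug":          (ss_drug, df_drug, ss_drug / df_drug / ms_error),
        "Gender * Drug": (ss_inter, df_inter, ss_inter / df_inter / ms_error),
        "Error":         (ss_error, df_error, None),
    }

for source, (ss, df, f) in two_way_anova(tumour).items():
    f_text = "" if f is None else f"F = {f:.3f}"
    print(f"{source:14s} SS = {ss:9.3f}  df = {df:2d}  {f_text}")
```

Each F ratio is the effect's mean square divided by the error mean square, so the degrees of freedom to report for an effect are its own df and the error df.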
Repeated Measures ANOVA
Remember that one of the assumptions of ANOVA is independence of the groups being compared. In lots of circumstances, we want to test the same thing repeatedly, e.g:
•Patients with a chronic disease after 3, 6 and 12 months of drug treatment
•Repeated sampling from the same location, e.g. spring, summer, autumn and winter
•etc
This type of study reduces variability in the data and so increases the power to detect effects, but violates the assumption of independence, so as with the paired t-test, we need to use a special form of ANOVA called repeated measures. In a repeated measures design, the assumption that the variances of the differences between pairs of conditions are equal is called "sphericity". Violating sphericity means that the F statistic cannot be compared to the normal tables of F, and so the significance value is not reliable unless a correction is applied. SPSS includes a procedure called Mauchly's test which tells us if the assumption of sphericity has been violated:
•If Mauchly’s test statistic is significant (i.e. p <.05) we conclude that the condition of sphericity has not been met.
•If Mauchly’s test statistic is nonsignificant (i.e. p >.05) it is reasonable to conclude that the variances of differences are not significantly different.
If Mauchly’s test is significant then we cannot trust the F-ratios produced by SPSS unless we apply a correction (which, fortunately, SPSS helps us to do).
One-Way Repeated Measures ANOVA
i.e. one independent variable, e.g. pain score after surgery:
Patient1 Patient2 Patient3
1 3 1
2 5 3
4 6 6
5 7 4
5 9 1
6 10 3
This data can be entered directly into SPSS. Note that each column represents a repeated measures variable (patients in this case). There is no need for a coding variable (as with between-group designs, above):
It's always a good idea to enter descriptive labels for data into the Variable View window, or the output is difficult to interpret! Next:
Analyze: General Linear Model: Repeated Measures
Within-Subject factor name: Patient
Number of Levels: 3 (because there are 3 patients)
Click Add, then Define (factors):
There are no proper post hoc tests for repeated measures variables in SPSS. However, via the Options button, you can use the paired t-test procedure to compare all pairs of levels of the independent variable, and then apply a Bonferroni correction to the probability at which you accept any of these tests. The resulting probability value should be used as the criterion for statistical significance. A ‘Bonferroni correction’ is achieved by dividing the probability value (usually 0.05) by the number of tests conducted, e.g. if we compare all levels of the independent variable of these data, we make three comparisons and so the appropriate significance level is 0.05/3 = 0.0167. Therefore, we accept t-tests as being significant only if they have a p value <0.0167.
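The Bonferroni arithmetic described above is simple enough to sketch in Python. The function names are made up for illustration, and the p values in the usage lines are hypothetical, not taken from any output in this post.

```python
def bonferroni_alpha(alpha, n_tests):
    """Significance level each individual test must reach."""
    return alpha / n_tests

def bonferroni_adjust(p_values):
    """Equivalent view: multiply each p value by the number of tests (capped at 1)
    and compare the result with the original alpha."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

print(round(bonferroni_alpha(0.05, 3), 4))     # three comparisons, as above
print(bonferroni_adjust([0.001, 0.04, 0.5]))   # hypothetical p values
```

The two forms give the same decisions: comparing a raw p value with 0.05/3 is the same as comparing three times the p value with 0.05.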
Output:
Mauchly's Test of Sphericity
Within Subjects Effect Mauchly's W Approx. Chi-Square df Sig. Epsilon (Greenhouse-Geisser) Epsilon (Huynh-Feldt) Epsilon (Lower-bound)
patient .094 9.437 2 .009 .525 .544 .500
Mauchly’s test is significant (p <.05) so we conclude that the assumption of sphericity has not been met.
Tests of Within-Subjects Effects Source
Type III Sum of Squares df Mean Square F Sig.
patient Sphericity Assumed 44.333 2 22.167 8.210 .008
Greenhouse-Geisser 44.333 1.050 42.239 8.210 .033
Huynh-Feldt 44.333 1.088 40.752 8.210 .031
Lower-bound 44.333 1.000 44.333 8.210 .035
Error(patient) Sphericity Assumed 27.000 10 2.700
Greenhouse-Geisser 27.000 5.248 5.145
Huynh-Feldt 27.000 5.439 4.964
Lower-bound 27.000 5.000 5.400
Because the significance values are <.05, we conclude that there was a significant difference between the three patients, but this test does not tell us which patients differed from each other. The next issue is which of the three corrections to use. Going back to Mauchly's test:
•If epsilon is >0.75, use the Huynh-Feldt correction.
•If epsilon is <0.75, or nothing is known about sphericity at all, use the Greenhouse-Geisser correction.
•In this example, the epsilon values from Mauchly's test values are 0.525 and 0.544, both <0.75, so we use the Greenhouse-Geisser corrected values. Using this correction, F is still significant because its p value is 0.033, which is <.05.
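The rule of thumb above, and the way a correction changes the degrees of freedom, can be sketched as follows. The assumptions here are that the Greenhouse-Geisser estimate is the epsilon compared with 0.75, and that a correction simply multiplies both degrees of freedom by epsilon; the function names are illustrative only.

```python
def choose_correction(gg_epsilon, threshold=0.75):
    """Pick a correction from the Greenhouse-Geisser epsilon (rule of thumb above)."""
    return "Huynh-Feldt" if gg_epsilon > threshold else "Greenhouse-Geisser"

def corrected_df(epsilon, levels, subjects):
    """Scale the effect and error degrees of freedom by epsilon."""
    df_effect = epsilon * (levels - 1)
    df_error = epsilon * (levels - 1) * (subjects - 1)
    return df_effect, df_error

# Three levels of the within-subject factor, six rows of data, epsilon = .525
print(choose_correction(0.525))
print(corrected_df(0.525, levels=3, subjects=6))
```

With epsilon rounded to .525 this gives roughly 1.05 and 5.25 degrees of freedom, matching the Greenhouse-Geisser rows of the table (SPSS shows 1.050 and 5.248 because it uses the unrounded epsilon).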
Post Hoc Tests:
Pairwise Comparisons (I) patient (J) patient Mean Difference (I-J) Std. Error Sig.(a) 95% Confidence Interval for Difference(a)
Lower Bound Upper Bound
1 2 -2.833(*) .401 .003 -4.252 -1.415
3 .833 .946 1.000 -2.509 4.176
2 1 2.833(*) .401 .003 1.415 4.252
3 3.667 1.282 .106 -.865 8.199
3 1 -.833 .946 1.000 -4.176 2.509
2 -3.667 1.282 .106 -8.199 .865
Based on estimated marginal means
* The mean difference is significant at the .05 level.
a Adjustment for multiple comparisons: Bonferroni.
Formal reporting:
Mauchly’s test indicated that the assumption of sphericity had been violated (chi-square = 9.44, p <.05), therefore degrees of freedom were corrected using Greenhouse-Geisser estimates of sphericity (epsilon = 0.53). The results show that the pain scores of the three patients differed significantly, F(1.05, 5.25) = 8.21, p <.05. Post hoc tests revealed that although the pain score of Patient2 was significantly higher than that of Patient1 (p = .003), Patient3's score was not significantly different from either of the other patients (both p>.05).
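The uncorrected ("Sphericity Assumed") F ratio reported above can be recomputed from the small data table. This is a sketch in plain Python rather than the SPSS procedure; `repeated_measures_anova` is a made-up name. The key step is that variation between the rows (the repeated units) is removed from the error term before F is formed.

```python
# Rows are the six lines of the data table above; columns are Patient1..Patient3.
scores = [
    [1, 3, 1],
    [2, 5, 3],
    [4, 6, 6],
    [5, 7, 4],
    [5, 9, 1],
    [6, 10, 3],
]

def repeated_measures_anova(rows):
    """One-way repeated measures ANOVA, sphericity assumed.

    Returns (SS effect, SS error, df effect, df error, F).
    """
    n, k = len(rows), len(rows[0])
    grand = sum(sum(r) for r in rows) / (n * k)
    col_means = [sum(r[j] for r in rows) / n for j in range(k)]
    row_means = [sum(r) / k for r in rows]
    ss_effect = n * sum((m - grand) ** 2 for m in col_means)  # between columns
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)    # between rows
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_effect - ss_rows                 # what is left over
    df_effect, df_error = k - 1, (k - 1) * (n - 1)
    f_ratio = (ss_effect / df_effect) / (ss_error / df_error)
    return ss_effect, ss_error, df_effect, df_error, f_ratio

ss_effect, ss_error, df_effect, df_error, f = repeated_measures_anova(scores)
print(f"SS effect = {ss_effect:.3f}, SS error = {ss_error:.3f}")
print(f"F({df_effect}, {df_error}) = {f:.3f}")
```

The corrections (Greenhouse-Geisser and so on) leave this F value unchanged; they only shrink the degrees of freedom used to look up its significance.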
Two-Way Repeated Measures ANOVA
i.e. two independent variables:
In a study of the best way to keep fields free of weeds for an entire growing season, a farmer treated test plots in 10 fields with either five different concentrations of weedkiller (independent variable 1) or five different length blasts with a flamethrower (independent variable 2). At the end of the growing season, the number of weeds per square metre was counted. To exclude bias (e.g. pre-existing seedbank in the soil), the following year, the farmer repeated the experiment but this time the treatments the fields received were reversed:
Treatment: Weedkiller Flamethrower
Severity: 1 2 3 4 5 1 2 3 4 5
Field1 10 15 18 22 37 9 13 13 18 22
Field2 10 18 10 42 60 7 14 20 21 32
Field3 7 11 28 31 56 9 13 24 30 35
Field4 9 19 36 45 60 7 14 9 20 25
Field5 15 14 29 33 37 14 13 20 22 29
Field6 14 13 26 26 49 5 12 17 16 33
Field7 9 12 19 37 48 5 15 12 17 24
Field8 9 18 22 31 39 13 13 14 17 17
Field9 12 14 24 28 53 12 13 21 19 22
Field10 7 11 21 23 45 12 14 20 21 29
SPSS Data View:
It's always a good idea to enter descriptive labels for data into the Variable View window, or the output is difficult to interpret:
Analyze: General Linear Model: Repeated Measures
Define Within Subject Factors (remember, "factor" = test or treatment):
Treatment, (2 treatments, weedkiller or flamethrower) (SPSS only allows 8 characters for the name)
Severity (5 different severities):
Click Define and define Within Subject Variables:
As above, there are no post hoc tests for repeated measures ANOVA in SPSS, but via the Options button, we can apply a Bonferroni correction to the probability at which you accept any of the tests:
Output:
Mauchly's Test of Sphericity(b)
Measure: MEASURE_1 Within Subjects Effect Mauchly's W Approx. Chi-Square df Sig. Epsilon
Greenhouse-Geisser Huynh-Feldt Lower-bound
treatmen 1.000 .000 0 . 1.000 1.000 1.000
severity .092 17.685 9 .043 .552 .740 .250
treatmen * severity .425 6.350 9 .712 .747 1.000 .250
The outcome of Mauchly’s test is significant (p <.05) for the severity of treatment, so we need to correct the F-values for this, but not for the treatments themselves.
Tests of Within-Subjects Effects Source
Type III Sum of Squares df Mean Square F Sig.
treatmen Sphericity Assumed 1730.560 1 1730.560 34.078 .000
Greenhouse-Geisser 1730.560 1.000 1730.560 34.078 .000
Huynh-Feldt 1730.560 1.000 1730.560 34.078 .000
Lower-bound 1730.560 1.000 1730.560 34.078 .000
Error(treatmen) Sphericity Assumed 457.040 9 50.782
Greenhouse-Geisser 457.040 9.000 50.782
Huynh-Feldt 457.040 9.000 50.782
Lower-bound 457.040 9.000 50.782
severity Sphericity Assumed 9517.960 4 2379.490 83.488 .000
Greenhouse-Geisser 9517.960 2.209 4309.021 83.488 .000
Huynh-Feldt 9517.960 2.958 3217.666 83.488 .000
Lower-bound 9517.960 1.000 9517.960 83.488 .000
Error(severity) Sphericity Assumed 1026.040 36 28.501
Greenhouse-Geisser 1026.040 19.880 51.613
Huynh-Feldt 1026.040 26.622 38.541
Lower-bound 1026.040 9.000 114.004
treatmen * severity Sphericity Assumed 1495.240 4 373.810 20.730 .000
Greenhouse-Geisser 1495.240 2.989 500.205 20.730 .000
Huynh-Feldt 1495.240 4.000 373.810 20.730 .000
Lower-bound 1495.240 1.000 1495.240 20.730 .001
Error(treatmen*severity) Sphericity Assumed 649.160 36 18.032
Greenhouse-Geisser 649.160 26.903 24.129
Huynh-Feldt 649.160 36.000 18.032
Lower-bound 649.160 9.000 72.129
Since there was no violation of sphericity, we can look at the comparison of the two treatments without any correction. The significance value shows (0.000) that there was a significant difference between the two treatments, but does not tell us which treatments produced this effect.
The output also tells us the effect of the severity of treatments, but remember there was a violation of sphericity here, so we must look at the corrected F-ratios. All of the corrected values are highly significant and so we can use the Greenhouse-Geisser corrected values as these are the most conservative.
Pairwise Comparisons (I) severity (J) severity Mean Difference (I-J) Std. Error Sig.(a) 95% Confidence Interval for Difference(a)
Lower Bound Upper Bound
1 2 -4.200(*) .895 .011 -7.502 -.898
3 -10.400(*) 1.190 .000 -14.790 -6.010
4 -16.200(*) 1.764 .000 -22.709 -9.691
5 -27.850(*) 2.398 .000 -36.698 -19.002
2 1 4.200(*) .895 .011 .898 7.502
3 -6.200(*) 1.521 .028 -11.810 -.590
4 -12.000(*) 1.280 .000 -16.723 -7.277
5 -23.650(*) 2.045 .000 -31.197 -16.103
3 1 10.400(*) 1.190 .000 6.010 14.790
2 6.200(*) 1.521 .028 .590 11.810
4 -5.800 1.690 .075 -12.036 .436
5 -17.450(*) 2.006 .000 -24.852 -10.048
4 1 16.200(*) 1.764 .000 9.691 22.709
2 12.000(*) 1.280 .000 7.277 16.723
3 5.800 1.690 .075 -.436 12.036
5 -11.650(*) 1.551 .000 -17.373 -5.927
5 1 27.850(*) 2.398 .000 19.002 36.698
2 23.650(*) 2.045 .000 16.103 31.197
3 17.450(*) 2.006 .000 10.048 24.852
4 11.650(*) 1.551 .000 5.927 17.373
* The mean difference is significant at the .05 level.
a Adjustment for multiple comparisons: Bonferroni.
This shows that there was only one pair of severity levels for which there was no significant difference: levels 3 and 4 (p = .075). The differences for all the other pairs are significant. Both the type of treatment (weedkiller or flamethrower) and its severity make a difference to weed control, and the significant treatment * severity interaction suggests that the effect of severity is not the same for the two treatments.
Formal report:
There was a significant main effect of the type of treatment, F(1, 9) = 34.08, p < .001.
There was a significant main effect of the severity of treatment, F(2.21, 19.88) = 83.49, p <.001.
Never, ever, run any statistical test without performing EDA first!
What's wrong with t-tests?
Nothing, except ...
If you want to compare three groups using t-tests with the usual 0.05 level of significance, you would have to compare the groups pairwise (A to B, A to C, B to C), so the chance of getting at least one wrong result (a false positive) would be:
1 - (0.95 x 0.95 x 0.95) = 14.3%
If you wanted to compare four groups (six pairwise comparisons), the chance of getting the wrong result would be 1 - (0.95)^6 = 26%, and for five groups (ten comparisons), 40%. Not good, is it? So we use ANOVA. Never perform multiple t-tests: Anyone on this module discovered performing multiple t-tests when they should use ANOVA will be shot!
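The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and it treats the pairwise tests as independent, which is the same simplification the percentages above rely on.

```python
from math import comb

def familywise_error(groups, alpha=0.05):
    """Chance of at least one false positive across all pairwise t-tests,
    treating the tests as independent (a simplifying assumption)."""
    comparisons = comb(groups, 2)          # number of distinct pairs
    return comparisons, 1 - (1 - alpha) ** comparisons

for g in (3, 4, 5):
    k, rate = familywise_error(g)
    print(f"{g} groups: {k} comparisons, error rate {rate:.1%}")
```

The number of comparisons grows much faster than the number of groups, which is why the error rate climbs so quickly.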
ANalysis Of VAriance (ANOVA) is such an important statistical method that it would be easy to spend a whole module on this test alone. Like the t-test, ANOVA is a parametric test which assumes:
•data is numerical data representing samples from normally distributed populations
•the variances of the groups are "similar"
•the sizes of the groups are "similar"
•the groups should be independent
so it's important to carry out EDA before starting ANOVA! In fact, ANOVA is quite a robust procedure, so as long as the groups are similar, the test is normally reliable.
ANOVA tests the null hypothesis that the means of all the groups being compared are equal, and produces a statistic called F which is analogous to the t-statistic from a t-test. But there's a catch. If the means of all the groups tested by ANOVA are equal, fine. But if the result tells us to reject the null hypothesis, we still don't know which of the means differ. We solve this problem by performing what is known as a "post hoc" (after the event) test.
Reminder:
•Independent variable: Variables which are experimentally manipulated by an investigator are called independent variables.
•Dependent variable: Variables which are measured are called dependent variables (because they are presumed to depend on the value of the independent variable).
ANOVA jargon:
•Way = an independent variable, so a one-way ANOVA has one independent variable, two-way ANOVA has two independent variables, etc. Simple ANOVA tests the hypothesis that means from two or more samples are equal (drawn from populations with the same mean). Student's t-test is actually a particular application of one-way ANOVA (two groups compared).
•Factor = an independent variable (a test or treatment). Single-factor ANOVA tests whether the means of the groups being compared are equal and returns a yes/no answer; two-factor ANOVA simultaneously tests two factors, e.g. tumour size after treatment with different drugs and/or radiotherapy (drug treatment is one factor and radiotherapy is another). So, "factor" and "way" are alternative terms for the same thing (independent variables).
•Repeated measures: Used when members of a sample are measured under different conditions. As the sample is exposed to each condition, the measurement of the dependent variable is repeated. Using standard ANOVA is not appropriate because it fails to take into account correlation between the repeated measures, violating the assumption of independence. This approach can be used for several reasons, e.g. where research requires repeated measures, such as longitudinal research which measures each sample member at each of several ages - age is a repeated factor. This is comparable to a paired t-test.
The array of options for different ANOVA tests in SPSS is confusing, so I'll go through the most important bits using some examples.
One-Way / Single-Factor ANOVA:
Data:
Pain Scores for Analgesics
Drug: Pain Score:
Diclofenac 0, 35, 31, 29, 20, 7, 43, 16
Ibuprofen 30, 40, 27, 25, 39, 15, 30, 45
Paracetamol 16, 33, 25, 32, 21, 54, 57, 19
Aspirin 55, 58, 56, 57, 56, 53, 59, 55
Since it would be unethical to withhold pain relief, there is no control group and we are just interested in knowing whether one drug performs better (lower pain score) than another, so we need to perform a one-way/single-factor ANOVA.
We enter this data into SPSS using dummy values (1, 2, 3, 4) for the drugs so this numeric data can be used in the ANOVA:
It's always a good idea to enter descriptive labels for data into the Variable View window, or the output is difficult to interpret!
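To make the coding step concrete, here is a minimal sketch of the same "one row per score, with a numeric drug code" layout in Python. The variable names are illustrative, and the codes 1 to 4 are simply assigned in the order the drugs are listed above.

```python
# Pain scores from the table above, one list per drug
pain = {
    "Diclofenac":  [0, 35, 31, 29, 20, 7, 43, 16],
    "Ibuprofen":   [30, 40, 27, 25, 39, 15, 30, 45],
    "Paracetamol": [16, 33, 25, 32, 21, 54, 57, 19],
    "Aspirin":     [55, 58, 56, 57, 56, 53, 59, 55],
}

# Illustrative dummy codes: 1 = Diclofenac ... 4 = Aspirin (order of entry)
codes = {drug: code for code, drug in enumerate(pain, start=1)}

# One (drug code, pain score) row per observation, like the SPSS Data View
rows = [(codes[drug], score) for drug, scores in pain.items() for score in scores]

for drug_code, score in rows[:3]:
    print(drug_code, score)
```

Keeping the `codes` dictionary around plays the same role as the descriptive labels in the Variable View: it records what each number means.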
EDA (Analyze: Descriptive Statistics: Explore) shows that the data is normally distributed, so we can proceed with the ANOVA:
Analyze: Compare Means: One-Way ANOVA
Dependent variable: Pain Score
Factor: Drug:
•SPSS allows many different post hoc tests. Click Post Hoc and select the Tukey and Games-Howell tests.
◦The Tukey test is powerful and widely accepted, but is parametric in that it assumes that the population variances are equal. It also assumes that the sample sizes are equal. If this is not the case, you should use Gabriel's procedure, or if the sizes are very different, use Hochberg's GT2.
◦Games-Howell does not assume population variances are equal or that sample sizes are equal, so is a good alternative if either assumption turns out not to hold.
•Click Options and select Homogeneity of Variance Test, Brown-Forsythe and Welch. The homogeneity of variance test is important since this is an assumption of ANOVA, but if this assumption turns out to be broken, the Brown-Forsythe and Welch options will display alternative versions of the F statistic which means you may still be able to use the result.
•Click OK to run the tests.
Output:
Test of Homogeneity of Variances: Pain
Levene Statistic   df1   df2   Sig.
4.837              3     28    .008
The significance value for homogeneity of variances is <.05, so the variances of the groups are significantly different. Since this is an assumption of ANOVA, we need to be very careful in interpreting the outcome of this test:
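As a cross-check, the Levene statistic can be reproduced by hand. The sketch below assumes the mean-based form of the test (a one-way ANOVA run on each score's absolute deviation from its own group mean), which matches the 4.837 reported here; the function name is illustrative and the significance value is left to SPSS.

```python
groups = [
    [0, 35, 31, 29, 20, 7, 43, 16],      # Diclofenac
    [30, 40, 27, 25, 39, 15, 30, 45],    # Ibuprofen
    [16, 33, 25, 32, 21, 54, 57, 19],    # Paracetamol
    [55, 58, 56, 57, 56, 53, 59, 55],    # Aspirin
]

def levene_statistic(groups):
    """Mean-based Levene statistic: F ratio of the absolute deviations."""
    # Absolute deviation of every score from its own group mean
    devs = [[abs(x - sum(g) / len(g)) for x in g] for g in groups]
    k = len(devs)
    n = sum(len(d) for d in devs)
    grand = sum(sum(d) for d in devs) / n
    means = [sum(d) / len(d) for d in devs]
    ss_between = sum(len(d) * (m - grand) ** 2 for d, m in zip(devs, means))
    ss_within = sum((z - m) ** 2 for d, m in zip(devs, means) for z in d)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

print(round(levene_statistic(groups), 3))
```

The aspirin scores are packed into a very narrow range compared with the other three drugs, and that is what drives the statistic up.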
ANOVA: Pain
Sum of Squares df Mean Square F Sig.
Between Groups 4956.375 3 1652.125 11.967 .000
Within Groups 3865.500 28 138.054
Total 8821.875 31
This is the main ANOVA result. The significance value comparing the groups (drugs) is <.05, so we could reject the null hypothesis (there is no difference in the mean pain scores with the four drugs). However, since the variances are significantly different, this might be the wrong answer. Fortunately, the Welch and Brown-Forsythe statistics can still be used in these circumstances:
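The sums of squares and F in the ANOVA table can be reproduced from the raw scores. This is a minimal sketch of the textbook one-way calculation, not SPSS's own code, and the function name is made up for the example.

```python
groups = [
    [0, 35, 31, 29, 20, 7, 43, 16],      # Diclofenac
    [30, 40, 27, 25, 39, 15, 30, 45],    # Ibuprofen
    [16, 33, 25, 32, 21, 54, 57, 19],    # Paracetamol
    [55, 58, 56, 57, 56, 53, 59, 55],    # Aspirin
]

def one_way_anova(groups):
    scores = [x for g in groups for x in g]
    grand_mean = sum(scores) / len(scores)
    # Between groups: how far each group mean sits from the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within groups: spread of the scores around their own group mean
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio

ss_b, ss_w, df_b, df_w, f_ratio = one_way_anova(groups)
print(f"Between: SS={ss_b:.3f}, df={df_b}")
print(f"Within:  SS={ss_w:.3f}, df={df_w}")
print(f"F = {f_ratio:.3f}")
```

F is just the between-groups mean square divided by the within-groups mean square, so a large F means the group means are far apart relative to the scatter inside each group.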
Robust Tests of Equality of Means: Pain
Statistic df1 df2 Sig.
Welch 32.064 3 12.171 .000
Brown-Forsythe 11.967 3 18.889 .000
The significance values of both are <.05, so we still reject the null hypothesis. However, this result does not tell us which drugs are responsible for the difference, so we need the post hoc test results:
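For the curious, the Welch statistic and its odd-looking second degrees of freedom can be reproduced with the standard Welch formula, which weights each group by its size divided by its variance. The sketch below is an illustration with a made-up function name; it does not compute the significance value.

```python
from statistics import mean, variance

groups = [
    [0, 35, 31, 29, 20, 7, 43, 16],      # Diclofenac
    [30, 40, 27, 25, 39, 15, 30, 45],    # Ibuprofen
    [16, 33, 25, 32, 21, 54, 57, 19],    # Paracetamol
    [55, 58, 56, 57, 56, 53, 59, 55],    # Aspirin
]

def welch_anova(groups):
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Weight each group by n / variance, so noisy groups count for less
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total
    numerator = sum(w * (m - weighted_mean) ** 2
                    for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total) ** 2 / (n - 1) for w, n in zip(weights, sizes))
    f_ratio = numerator / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_ratio, k - 1, df2

f_ratio, df1, df2 = welch_anova(groups)
print(f"Welch F = {f_ratio:.3f}, df1 = {df1}, df2 = {df2:.3f}")
```

Because the weights depend on the group variances, the second degrees of freedom is no longer a whole number, which is why the table shows 12.171.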
Multiple Comparisons
Dependent Variable: Pain
(I) Drug (J) Drug Mean Difference (I-J) Std. Error Sig. 95% Confidence Interval
Lower Bound Upper Bound
Tukey HSD 1 2 -8.750 5.875 .457 -24.79 7.29
3 -9.500 5.875 .386 -25.54 6.54
4 -33.500(*) 5.875 .000 -49.54 -17.46
2 1 8.750 5.875 .457 -7.29 24.79
3 -.750 5.875 .999 -16.79 15.29
4 -24.750(*) 5.875 .001 -40.79 -8.71
3 1 9.500 5.875 .386 -6.54 25.54
2 .750 5.875 .999 -15.29 16.79
4 -24.000(*) 5.875 .002 -40.04 -7.96
4 1 33.500(*) 5.875 .000 17.46 49.54
2 24.750(*) 5.875 .001 8.71 40.79
3 24.000(*) 5.875 .002 7.96 40.04
Games-Howell 1 2 -8.750 6.176 .513 -27.05 9.55
3 -9.500 7.548 .602 -31.45 12.45
4 -33.500(*) 5.194 .001 -50.55 -16.45
2 1 8.750 6.176 .513 -9.55 27.05
3 -.750 6.485 .999 -20.09 18.59
4 -24.750(*) 3.471 .001 -36.03 -13.47
3 1 9.500 7.548 .602 -12.45 31.45
2 .750 6.485 .999 -18.59 20.09
4 -24.000(*) 5.558 .014 -42.26 -5.74
4 1 33.500(*) 5.194 .001 16.45 50.55
2 24.750(*) 3.471 .001 13.47 36.03
3 24.000(*) 5.558 .014 5.74 42.26
* The mean difference is significant at the .05 level.
The Tukey test relies on homogeneity of variance, so we ignore these results. The Games-Howell post-hoc test does not rely on homogeneity of variance (this is why we used two different post-hoc tests) and so can be used. SPSS kindly flags (*) which differences are significant!
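The Std. Error column shows why the two tests behave differently. A minimal sketch, assuming the usual formulas: Tukey uses one pooled within-groups mean square for every pair, while Games-Howell uses only the variances of the two groups being compared. The function names are illustrative.

```python
from math import sqrt
from statistics import variance

groups = [
    [0, 35, 31, 29, 20, 7, 43, 16],      # 1 = Diclofenac
    [30, 40, 27, 25, 39, 15, 30, 45],    # 2 = Ibuprofen
    [16, 33, 25, 32, 21, 54, 57, 19],    # 3 = Paracetamol
    [55, 58, 56, 57, 56, 53, 59, 55],    # 4 = Aspirin
]

# Pooled within-groups mean square (the "Within Groups" row of the ANOVA table)
n_total = sum(len(g) for g in groups)
ms_within = sum((len(g) - 1) * variance(g) for g in groups) / (n_total - len(groups))

def tukey_se(i, j):
    """Same pooled variance for every pair (assumes equal variances)."""
    return sqrt(ms_within * (1 / len(groups[i]) + 1 / len(groups[j])))

def games_howell_se(i, j):
    """Each pair uses its own two variances (no equal-variance assumption)."""
    return sqrt(variance(groups[i]) / len(groups[i]) +
                variance(groups[j]) / len(groups[j]))

for i, j in [(0, 1), (0, 2), (1, 3)]:
    print(f"Drug {i + 1} vs {j + 1}: Tukey {tukey_se(i, j):.3f}, "
          f"Games-Howell {games_howell_se(i, j):.3f}")
```

With equal group sizes the Tukey standard error is identical for every pair, whereas the Games-Howell value shrinks whenever the low-variance aspirin group is involved.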
Result: Drug 4 (Aspirin) produces a significantly different result (a higher pain score) from the other three drugs:
Formal Reporting: When we report the outcome of an ANOVA, we cite the value of the F ratio and give the number of degrees of freedom, outcome (in a neutral fashion) and significance value. So in this case:
There is a significant difference between the pain scores for aspirin and the other three drugs tested, F(3,28) = 11.97, p < .05.
Two-Factor ANOVA
Do anti-cancer drugs have different effects in males and females?
Data:
Tumour size:
Drug:     cisplatin         vinblastine       5-fluorouracil
Gender:   Female   Male     Female   Male     Female   Male
          65       50       70       45       55       35
          70       55       65       60       65       40
          60       80       60       85       70       35
          60       65       70       65       55       55
          60       70       65       70       55       35
          55       75       60       70       60       40
          60       75       60       80       50       45
          50       65       50       60       50       40
We enter this data into SPSS using dummy values for the drugs (1, 2, 3) and genders (1,2) so the coded data can be used in the ANOVA:
It's always a good idea to enter descriptive labels for data into the Variable View window, or the output is difficult to interpret!
EDA (Analyze: Descriptive Statistics: Explore) shows that the data is normally distributed, so we can proceed with the ANOVA:
Analyze: General Linear Model: Univariate
Dependent variable: Tumour Diameter
Fixed Factors: Gender, Drug:
Also select:
Post Hoc: Tukey and Games-Howell:
Options:
Display Means for: Gender, Drug, Gender*Drug
Descriptive Statistics
Homogeneity tests:
Output:
Levene's Test of Equality of Error Variances(a)
Dependent Variable: Diameter
F       df1   df2   Sig.
1.462   5     42    .223
Tests the null hypothesis that the error variance of the dependent variable is equal across groups.
a Design: Intercept+Gender+Drug+Gender * Drug
The significance result for homogeneity of variance is >.05, so there is no evidence that the error variance of the dependent variable differs across the groups, i.e. the assumption of the ANOVA test has been met.
Tests of Between-Subjects Effects
Dependent Variable: Diameter Source Type III Sum of Squares df Mean Square F Sig.
Corrected Model 3817.188(a) 5 763.438 10.459 .000
Intercept 167442.188 1 167442.188 2294.009 .000
Gender 42.188 1 42.188 .578 .451
Drug 2412.500 2 1206.250 16.526 .000
Gender * Drug 1362.500 2 681.250 9.333 .000
Error 3065.625 42 72.991
Total 174325.000 48
Corrected Total 6882.813 47
a R Squared = .555 (Adjusted R Squared = .502)
The effects of Drug and of the Gender * Drug interaction are significant (<.05), but there is no main effect of gender (p = 0.451). Again, this does not tell us which drugs behave differently, so again we need to look at the post hoc tests below.
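The Gender, Drug and Gender * Drug rows of the table can be reproduced from the raw data. The sketch below is an illustration only: it uses the simple formulas for a balanced design (every cell has 8 scores), which is the case where they agree with SPSS's Type III sums of squares, and all the names are made up for the example.

```python
tumour = {
    ("cisplatin", "Female"):      [65, 70, 60, 60, 60, 55, 60, 50],
    ("cisplatin", "Male"):        [50, 55, 80, 65, 70, 75, 75, 65],
    ("vinblastine", "Female"):    [70, 65, 60, 70, 65, 60, 60, 50],
    ("vinblastine", "Male"):      [45, 60, 85, 65, 70, 70, 80, 60],
    ("5-fluorouracil", "Female"): [55, 65, 70, 55, 55, 60, 50, 50],
    ("5-fluorouracil", "Male"):   [35, 40, 35, 55, 35, 40, 45, 40],
}

def mean(xs):
    return sum(xs) / len(xs)

drugs = ["cisplatin", "vinblastine", "5-fluorouracil"]
genders = ["Female", "Male"]
scores = [x for cell in tumour.values() for x in cell]
grand = mean(scores)

def ss_main(levels, position):
    """Sum of squares for one factor, pooling scores over the other factor."""
    total = 0.0
    for level in levels:
        pooled = [x for key, cell in tumour.items() if key[position] == level
                  for x in cell]
        total += len(pooled) * (mean(pooled) - grand) ** 2
    return total

ss_drug = ss_main(drugs, 0)
ss_gender = ss_main(genders, 1)
ss_cells = sum(len(c) * (mean(c) - grand) ** 2 for c in tumour.values())
ss_interaction = ss_cells - ss_drug - ss_gender
ss_error = sum((x - mean(c)) ** 2 for c in tumour.values() for x in c)
ms_error = ss_error / (len(scores) - len(tumour))      # error df = 48 - 6 = 42

f_gender = (ss_gender / (len(genders) - 1)) / ms_error
f_drug = (ss_drug / (len(drugs) - 1)) / ms_error
f_interaction = (ss_interaction / ((len(drugs) - 1) * (len(genders) - 1))) / ms_error
print(f"Gender F = {f_gender:.3f}, Drug F = {f_drug:.3f}, "
      f"Gender*Drug F = {f_interaction:.3f}")
```

The interaction term is what is left of the differences between the six cell means once the two main effects have been removed, which is why a significant interaction can coexist with a non-significant main effect of gender.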
Multiple Comparisons
Dependent Variable: Diameter
(I) Drug (J) Drug Mean Difference (I-J) Std. Error Sig. 95% Confidence Interval
Lower Bound Upper Bound
Tukey HSD cisplatin vinblastine -1.25 3.021 .910 -8.59 6.09
5-flourouracil 14.38(*) 3.021 .000 7.04 21.71
vinblastine cisplatin 1.25 3.021 .910 -6.09 8.59
5-flourouracil 15.63(*) 3.021 .000 8.29 22.96
5-flourouracil cisplatin -14.38(*) 3.021 .000 -21.71 -7.04
vinblastine -15.63(*) 3.021 .000 -22.96 -8.29
Games-Howell cisplatin vinblastine -1.25 3.329 .925 -9.46 6.96
5-flourouracil 14.38(*) 3.534 .001 5.64 23.11
vinblastine cisplatin 1.25 3.329 .925 -6.96 9.46
5-flourouracil 15.63(*) 3.699 .001 6.50 24.75
5-flourouracil cisplatin -14.38(*) 3.534 .001 -23.11 -5.64
vinblastine -15.63(*) 3.699 .001 -24.75 -6.50
Based on observed means.
* The mean difference is significant at the .05 level.
In this example, we can use the Tukey or Games-Howell results. Again, SPSS helpfully flags which results have reached statistical significance. We already know from the main ANOVA table that the effect of gender is not significant, but the post hoc tests show which drugs produce significantly different outcomes.
Formal Reporting: When we report the outcome of an ANOVA, we cite the value of the F ratio and give the number of degrees of freedom, outcome (in a neutral fashion) and significance value. So in this case:
There is a significant difference between the tumour diameter for 5-fluorouracil and the other two drugs tested, F(2, 42) = 16.53, p < .001.
Repeated Measures ANOVA
Remember that one of the assumptions of ANOVA is independence of the groups being compared. In lots of circumstances, we want to test the same thing repeatedly, e.g:
•Patients with a chronic disease after 3, 6 and 12 months of drug treatment
•Repeated sampling from the same location, e.g. spring, summer, autumn and winter
•etc
This type of study reduces variability in the data and so increases the power to detect effects, but violates the assumption of independence, so as with the paired t-test, we need to use a special form of ANOVA called repeated measures. Repeated measures ANOVA makes an assumption of its own, called "sphericity": the variances of the differences between all pairs of conditions should be roughly equal. If sphericity is violated, the F statistic can no longer be compared to the normal tables of F, so the significance value calculated from it cannot be trusted. SPSS includes a procedure called Mauchly's test which tells us if the assumption of sphericity has been violated:
•If Mauchly’s test statistic is significant (i.e. p < .05) we conclude that the condition of sphericity has not been met.
•If Mauchly’s test statistic is nonsignificant (i.e. p > .05) it is reasonable to conclude that the variances of differences are not significantly different.
If Mauchly’s test is significant then we cannot trust the F-ratios produced by SPSS unless we apply a correction (which, fortunately, SPSS helps us to do).
One-Way Repeated Measures ANOVA
i.e. one independent variable, e.g. pain score after surgery:
Patient1 Patient2 Patient3
1 3 1
2 5 3
4 6 6
5 7 4
5 9 1
6 10 3
This data can be entered directly into SPSS. Note that each column represents a repeated measures variable (patients in this case). Unlike the between-group designs above, there is no need for a coding variable:
It's always a good idea to enter descriptive labels for data into the Variable View window, or the output is difficult to interpret! Next:
Analyze: General Linear Model: Repeated Measures
Within-Subject factor name: Patient
Number of Levels: 3 (because there are 3 patients)
Click Add, then Define (factors):
There are no proper post hoc tests for repeated measures variables in SPSS. However, via the Options button, you can use the paired t-test procedure to compare all pairs of levels of the independent variable, and then apply a Bonferroni correction to the probability at which you accept any of these tests. The resulting probability value should be used as the criterion for statistical significance. A ‘Bonferroni correction’ is achieved by dividing the probability value (usually 0.05) by the number of tests conducted, e.g. if we compare all levels of the independent variable of these data, we make three comparisons and so the appropriate significance level is 0.05/3 = 0.0167. Therefore, we accept t-tests as being significant only if they have a p value <0.0167.
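The Bonferroni arithmetic is easy to sketch. The function names below are illustrative. The second function shows the equivalent approach of multiplying each p value by the number of tests (capped at 1) and comparing it with the usual 0.05.

```python
from math import comb

def bonferroni_threshold(levels, alpha=0.05):
    """Significance level to use when all pairs of levels are compared."""
    tests = comb(levels, 2)               # number of pairwise comparisons
    return tests, alpha / tests

def bonferroni_adjust(p_value, tests):
    """Equivalent view: inflate the p value instead of shrinking alpha."""
    return min(1.0, p_value * tests)

tests, threshold = bonferroni_threshold(3)
print(f"{tests} comparisons, accept p < {threshold:.4f}")
print(bonferroni_adjust(0.001, tests))    # an unadjusted p of 0.001 becomes 0.003
```

Both views give the same decisions: p < 0.05/3 is the same condition as 3p < 0.05.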
Output:
Mauchly's Test of Sphericity
Within Subjects Effect   Mauchly's W   Approx. Chi-Square   df   Sig.   Epsilon: Greenhouse-Geisser   Huynh-Feldt   Lower-bound
patient                  .094          9.437                2    .009   .525                          .544          .500
Mauchly’s test is significant (p <.05) so we conclude that the assumption of sphericity has not been met.
Tests of Within-Subjects Effects Source
Type III Sum of Squares df Mean Square F Sig.
patient Sphericity Assumed 44.333 2 22.167 8.210 .008
Greenhouse-Geisser 44.333 1.050 42.239 8.210 .033
Huynh-Feldt 44.333 1.088 40.752 8.210 .031
Lower-bound 44.333 1.000 44.333 8.210 .035
Error(patient) Sphericity Assumed 27.000 10 2.700
Greenhouse-Geisser 27.000 5.248 5.145
Huynh-Feldt 27.000 5.439 4.964
Lower-bound 27.000 5.000 5.400
Because the significance values are <.05, we conclude that there was a significant difference between the three patients, but this test does not tell us which patients differed from each other. The next issue is which of the three corrections to use. Going back to Mauchly's test:
•If epsilon is >0.75, use the Huynh-Feldt correction.
•If epsilon is <0.75, or nothing is known about sphericity at all, use the Greenhouse-Geisser correction.
•In this example, the Greenhouse-Geisser and Huynh-Feldt epsilon values reported with Mauchly's test are 0.525 and 0.544, both <0.75, so we use the Greenhouse-Geisser corrected values. Using this correction, F is still significant because its p value is 0.033, which is <.05.
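The F ratio and the corrected degrees of freedom above can be reproduced by hand. This is a minimal sketch of the textbook one-way repeated measures calculation, using the data exactly as laid out in the table (three columns, six rows). The epsilon value is simply read from the SPSS output rather than computed, and the variable names are illustrative.

```python
# Columns are Patient1..Patient3, rows are the six repeated observations
rows = [
    [1, 3, 1],
    [2, 5, 3],
    [4, 6, 6],
    [5, 7, 4],
    [5, 9, 1],
    [6, 10, 3],
]

n, k = len(rows), len(rows[0])
grand = sum(map(sum, rows)) / (n * k)
column_means = [sum(r[j] for r in rows) / n for j in range(k)]
row_means = [sum(r) / k for r in rows]

ss_effect = n * sum((m - grand) ** 2 for m in column_means)   # the "patient" effect
ss_rows = k * sum((m - grand) ** 2 for m in row_means)        # removed from the error
ss_total = sum((x - grand) ** 2 for r in rows for x in r)
ss_error = ss_total - ss_effect - ss_rows

df_effect, df_error = k - 1, (k - 1) * (n - 1)
f_ratio = (ss_effect / df_effect) / (ss_error / df_error)

# Greenhouse-Geisser: multiply both degrees of freedom by epsilon.
epsilon = 0.525          # taken from the SPSS output, not calculated here
gg_df = (epsilon * df_effect, epsilon * df_error)

print(f"SS effect = {ss_effect:.3f}, SS error = {ss_error:.3f}, F = {f_ratio:.3f}")
print(f"Corrected df = {gg_df[0]:.2f}, {gg_df[1]:.2f}")
```

Note that the correction leaves F itself unchanged. Only the degrees of freedom shrink, which makes the significance value larger (0.033 instead of 0.008).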
Post Hoc Tests:
Pairwise Comparisons (I) patient (J) patient Mean Difference (I-J) Std. Error Sig.(a) 95% Confidence Interval for Difference(a)
Lower Bound Upper Bound
1 2 -2.833(*) .401 .003 -4.252 -1.415
3 .833 .946 1.000 -2.509 4.176
2 1 2.833(*) .401 .003 1.415 4.252
3 3.667 1.282 .106 -.865 8.199
3 1 -.833 .946 1.000 -4.176 2.509
2 -3.667 1.282 .106 -8.199 .865
Based on estimated marginal means
* The mean difference is significant at the .05 level.
a Adjustment for multiple comparisons: Bonferroni.
Formal reporting:
Mauchly’s test indicated that the assumption of sphericity had been violated (chi-square = 9.44, p <.05), therefore degrees of freedom were corrected using Greenhouse-Geisser estimates of sphericity (epsilon = 0.53). The results show that the pain scores of the three patients differed significantly, F(1.05, 5.25) = 8.21, p <.05. Post hoc tests revealed that although the pain score of Patient2 was significantly higher than that of Patient1 (p = .003), Patient3's score was not significantly different from either of the other patients (both p>.05).
Two-Way Repeated Measures ANOVA
i.e. two independent variables:
In a study of the best way to keep fields free of weeds for an entire growing season, a farmer treated test plots in 10 fields with either five different concentrations of weedkiller (independent variable 1) or five different length blasts with a flamethrower (independent variable 2). At the end of the growing season, the number of weeds per square metre was counted. To exclude bias (e.g. pre-existing seedbank in the soil), the following year, the farmer repeated the experiment but this time the treatments the fields received were reversed:
Treatment: Weedkiller Flamethrower
Severity: 1 2 3 4 5 1 2 3 4 5
Field1 10 15 18 22 37 9 13 13 18 22
Field2 10 18 10 42 60 7 14 20 21 32
Field3 7 11 28 31 56 9 13 24 30 35
Field4 9 19 36 45 60 7 14 9 20 25
Field5 15 14 29 33 37 14 13 20 22 29
Field6 14 13 26 26 49 5 12 17 16 33
Field7 9 12 19 37 48 5 15 12 17 24
Field8 9 18 22 31 39 13 13 14 17 17
Field9 12 14 24 28 53 12 13 21 19 22
Field10 7 11 21 23 45 12 14 20 21 29
SPSS Data View:
It's always a good idea to enter descriptive labels for data into the Variable View window, or the output is difficult to interpret:
Analyze: General Linear Model: Repeated Measures
Define Within Subject Factors (remember, "factor" = test or treatment):
Treatment, (2 treatments, weedkiller or flamethrower) (SPSS only allows 8 characters for the name)
Severity (5 different severities):
Click Define and define Within Subject Variables:
As above, there are no post hoc tests for repeated measures ANOVA in SPSS, but via the Options button, we can apply a Bonferroni correction to the probability at which you accept any of the tests:
Output:
Mauchly's Test of Sphericity(b)
Measure: MEASURE_1
Within Subjects Effect   Mauchly's W   Approx. Chi-Square   df   Sig.   Epsilon: Greenhouse-Geisser   Huynh-Feldt   Lower-bound
treatmen                 1.000         .000                 0    .      1.000                         1.000         1.000
severity                 .092          17.685               9    .043   .552                          .740          .250
treatmen * severity      .425          6.350                9    .712   .747                          1.000         .250
The outcome of Mauchly’s test is significant (p <.05) for the severity of treatment, so we need to correct the F-values for this, but not for the treatments themselves.
Tests of Within-Subjects Effects Source
Type III Sum of Squares df Mean Square F Sig.
treatmen Sphericity Assumed 1730.560 1 1730.560 34.078 .000
Greenhouse-Geisser 1730.560 1.000 1730.560 34.078 .000
Huynh-Feldt 1730.560 1.000 1730.560 34.078 .000
Lower-bound 1730.560 1.000 1730.560 34.078 .000
Error(treatmen) Sphericity Assumed 457.040 9 50.782
Greenhouse-Geisser 457.040 9.000 50.782
Huynh-Feldt 457.040 9.000 50.782
Lower-bound 457.040 9.000 50.782
severity Sphericity Assumed 9517.960 4 2379.490 83.488 .000
Greenhouse-Geisser 9517.960 2.209 4309.021 83.488 .000
Huynh-Feldt 9517.960 2.958 3217.666 83.488 .000
Lower-bound 9517.960 1.000 9517.960 83.488 .000
Error(severity) Sphericity Assumed 1026.040 36 28.501
Greenhouse-Geisser 1026.040 19.880 51.613
Huynh-Feldt 1026.040 26.622 38.541
Lower-bound 1026.040 9.000 114.004
treatmen * severity Sphericity Assumed 1495.240 4 373.810 20.730 .000
Greenhouse-Geisser 1495.240 2.989 500.205 20.730 .000
Huynh-Feldt 1495.240 4.000 373.810 20.730 .000
Lower-bound 1495.240 1.000 1495.240 20.730 .001
Error(treatmen*severity) Sphericity Assumed 649.160 36 18.032
Greenhouse-Geisser 649.160 26.903 24.129
Huynh-Feldt 649.160 36.000 18.032
Lower-bound 649.160 9.000 72.129
Since there was no violation of sphericity for the type of treatment (with only two levels, sphericity cannot be violated), we can look at the comparison of the two treatments without any correction. The significance value (.000) shows that there was a significant difference between the two treatments, but does not tell us which treatment produced this effect.
The output also tells us the effect of the severity of treatments, but remember there was a violation of sphericity here, so we must look at the corrected F-ratios. All of the corrected values are highly significant, so we can use the Greenhouse-Geisser corrected values, which are more conservative than the Huynh-Feldt ones.
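Both main effects in the table can be reproduced from the field data. The sketch below is an illustration of the textbook calculation for a fully within-subjects design: each main effect is tested against its own effect-by-field interaction. The helper name is made up, and the interaction between treatment and severity is left out to keep the example short.

```python
# One row per field; columns are severity levels 1-5
weedkiller = [
    [10, 15, 18, 22, 37], [10, 18, 10, 42, 60], [7, 11, 28, 31, 56],
    [9, 19, 36, 45, 60], [15, 14, 29, 33, 37], [14, 13, 26, 26, 49],
    [9, 12, 19, 37, 48], [9, 18, 22, 31, 39], [12, 14, 24, 28, 53],
    [7, 11, 21, 23, 45],
]
flamethrower = [
    [9, 13, 13, 18, 22], [7, 14, 20, 21, 32], [9, 13, 24, 30, 35],
    [7, 14, 9, 20, 25], [14, 13, 20, 22, 29], [5, 12, 17, 16, 33],
    [5, 15, 12, 17, 24], [13, 13, 14, 17, 17], [12, 13, 21, 19, 22],
    [12, 14, 20, 21, 29],
]

def within_effect(totals, per_cell):
    """totals[s][j] = field s's summed score at level j of one factor,
    collapsed over the other factor; per_cell = scores in each sum."""
    n, levels = len(totals), len(totals[0])
    grand = sum(map(sum, totals))
    correction = grand ** 2 / (n * levels * per_cell)
    ss_effect = sum(sum(row[j] for row in totals) ** 2
                    for j in range(levels)) / (n * per_cell) - correction
    ss_fields = sum(sum(row) ** 2 for row in totals) / (levels * per_cell) - correction
    ss_cells = sum(t ** 2 for row in totals for t in row) / per_cell - correction
    ss_error = ss_cells - ss_effect - ss_fields        # effect x field interaction
    df_effect, df_error = levels - 1, (levels - 1) * (n - 1)
    f_ratio = (ss_effect / df_effect) / (ss_error / df_error)
    return ss_effect, ss_error, df_effect, df_error, f_ratio

pairs = list(zip(weedkiller, flamethrower))
treatment = within_effect([[sum(w), sum(f)] for w, f in pairs], per_cell=5)
severity = within_effect([[w[j] + f[j] for j in range(5)] for w, f in pairs], per_cell=2)

# Greenhouse-Geisser correction for severity, epsilon read from the SPSS output
epsilon = 0.552
gg_df = (epsilon * severity[2], epsilon * severity[3])

print(f"Treatment: SS = {treatment[0]:.2f}, error SS = {treatment[1]:.2f}, F = {treatment[4]:.3f}")
print(f"Severity:  SS = {severity[0]:.2f}, error SS = {severity[1]:.2f}, F = {severity[4]:.3f}")
print(f"Severity corrected df = {gg_df[0]:.2f}, {gg_df[1]:.2f}")
```

The corrected degrees of freedom differ from the SPSS table in the third decimal place only because epsilon is rounded to .552 in the output.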
Pairwise Comparisons (I) severity (J) severity Mean Difference (I-J) Std. Error Sig.(a) 95% Confidence Interval for Difference(a)
Lower Bound Upper Bound
1 2 -4.200(*) .895 .011 -7.502 -.898
3 -10.400(*) 1.190 .000 -14.790 -6.010
4 -16.200(*) 1.764 .000 -22.709 -9.691
5 -27.850(*) 2.398 .000 -36.698 -19.002
2 1 4.200(*) .895 .011 .898 7.502
3 -6.200(*) 1.521 .028 -11.810 -.590
4 -12.000(*) 1.280 .000 -16.723 -7.277
5 -23.650(*) 2.045 .000 -31.197 -16.103
3 1 10.400(*) 1.190 .000 6.010 14.790
2 6.200(*) 1.521 .028 .590 11.810
4 -5.800 1.690 .075 -12.036 .436
5 -17.450(*) 2.006 .000 -24.852 -10.048
4 1 16.200(*) 1.764 .000 9.691 22.709
2 12.000(*) 1.280 .000 7.277 16.723
3 5.800 1.690 .075 -.436 12.036
5 -11.650(*) 1.551 .000 -17.373 -5.927
5 1 27.850(*) 2.398 .000 19.002 36.698
2 23.650(*) 2.045 .000 16.103 31.197
3 17.450(*) 2.006 .000 10.048 24.852
4 11.650(*) 1.551 .000 5.927 17.373
* The mean difference is significant at the .05 level.
a Adjustment for multiple comparisons: Bonferroni.
This shows that there was only one pair of severity levels for which there was no significant difference: levels 3 and 4 (p = .075), described as 40% weedkiller followed by 2 minutes flame thrower, and 2 minutes flame thrower followed by 40% weedkiller. The differences for all the other pairs are significant, so how much weedkiller and how long a burst of flame the farmer uses does make a difference to weed control. The type of treatment matters too: as the formal report below shows, the main effect of weedkiller versus flamethrower was also significant.
Formal report:
There was a significant main effect of the type of treatment, F(1, 9) = 34.08, p < .001.
There was a significant main effect of the severity of treatment, F(2.21, 19.88) = 83.49, p <.001.