
English translation: K-means (MacQueen, 1967) is one of the simplest unsupervi

Source: 学生作业帮 | Editor: 搜搜做题作业网作业帮 | Category: English homework | Date: 2024/07/08 00:12:30
English translation
K-means (MacQueen, 1967) is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. The procedure follows a simple and easy way to classify a given data set through a certain number of clusters (assume k clusters) fixed a priori. The main idea is to define k centroids, one for each cluster. These centroids should be placed carefully, because different locations lead to different results. So the better choice is to place them as far away from each other as possible. The next step is to take each point belonging to the data set and associate it with the nearest centroid. When no point is pending, the first step is completed and an initial grouping is done. At this point we need to recalculate k new centroids as the barycenters of the clusters resulting from the previous step. Once we have these k new centroids, a new binding has to be done between the same data points and the nearest new centroid. A loop has thus been generated. As a result of this loop, the k centroids change their location step by step until no more changes occur; in other words, the centroids no longer move.
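The advice to place the initial centroids as far from each other as possible can be sketched as follows. This is only one way to read that sentence: it uses a farthest-first traversal, which the passage does not name, and the function name, the seed parameter and the sample points are all assumptions made for illustration.

```python
import math
import random

def farthest_first_centroids(points, k, seed=0):
    """Pick k initial centroids that are spread far apart (hypothetical helper)."""
    rng = random.Random(seed)
    # Assumption: the first centroid is simply a random data point.
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Take the point whose distance to its nearest chosen centroid is largest.
        next_point = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(next_point)
    return centroids

# Made-up sample data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 4.9), (10.0, 0.0)]
initial = farthest_first_centroids(points, k=3)
```

Because every later centroid is chosen to be far from the ones already picked, the three starting points here are always distinct, whichever point the random first pick selects.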
Finally, this algorithm aims at minimizing an objective function, in this case a squared error function. The objective function

$$J = \sum_{j=1}^{k} \sum_{i=1}^{n} \left\| x_i^{(j)} - c_j \right\|^2,$$

where $\left\| x_i^{(j)} - c_j \right\|^2$ is a chosen distance measure between a data point $x_i^{(j)}$ and the cluster centre $c_j$, is an indicator of the distance of the n data points from their respective cluster centres.
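As a small illustration of the squared error function described above, the sketch below sums the squared Euclidean distance from each point to the centre of the cluster it is assigned to. The function name, the `labels` list (the cluster index of each point) and the sample data are assumptions, and Euclidean distance is only one possible choice of distance measure.

```python
def squared_error(points, centroids, labels):
    """Sum of squared Euclidean distances from each point to its assigned centre."""
    total = 0.0
    for point, label in zip(points, labels):
        centre = centroids[label]
        total += sum((x - c) ** 2 for x, c in zip(point, centre))
    return total

# Made-up data: two points in cluster 0, one point in cluster 1.
points = [(1.0, 1.0), (1.0, 3.0), (8.0, 8.0)]
centroids = [(1.0, 2.0), (8.0, 8.0)]
labels = [0, 0, 1]
J = squared_error(points, centroids, labels)  # 1.0 + 1.0 + 0.0 = 2.0
```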
The algorithm is composed of the following steps:
1. Place K points into the space represented by the objects that are being clustered. These points represent initial group centroids.
2. Assign each object to the group that has the closest centroid.
3. When all objects have been assigned, recalculate the positions of the K centroids.
4. Repeat Steps 2 and 3 until the centroids no longer move. This produces a separation of the objects into groups from which the metric to be minimized can be calculated.
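The steps above can be sketched in plain Python as follows. This is a minimal sketch, not a reference implementation: the function name, the `max_iter` safety cap, the use of Euclidean distance and the rule that an empty cluster keeps its old centroid are all assumptions that the passage does not specify.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Repeat assignment (step 2) and update (step 3) until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]  # Step 1: initial centroids are given
    labels = []
    for _ in range(max_iter):
        # Step 2: assign each point to the group with the closest centroid.
        labels = [min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Step 3: recompute each centroid as the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(old)  # assumption: an empty cluster keeps its centroid
        # Step 4: stop once no centroid has moved.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels

# Made-up data: two well-separated pairs of points.
points = [(1.0, 1.0), (1.0, 3.0), (9.0, 8.0), (9.0, 10.0)]
centroids, labels = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
# centroids -> [(1.0, 2.0), (9.0, 9.0)], labels -> [0, 0, 1, 1]
```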
Although it can be proved that the procedure will always terminate, the k-means algorithm does not necessarily find the optimal configuration, that is, the one corresponding to the global minimum of the objective function. The algorithm is also significantly sensitive to the initial, randomly selected cluster centres. The k-means algorithm can be run multiple times to reduce this effect.
K-means is a simple algorithm that has been adapted to many problem domains. As we are going to see, it is a good candidate for extension to work with fuzzy feature vectors.
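Running the algorithm several times might look like the sketch below: each run starts from different randomly chosen centres, and the run with the smallest squared error is kept. The function names, the number of restarts, the seed and the sample data are assumptions for illustration only.

```python
import math
import random

def run_kmeans(points, k, rng, max_iter=100):
    """One k-means run from randomly chosen initial centres; returns (error, centroids)."""
    centroids = rng.sample(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Assumption: an empty cluster keeps its previous centroid.
        updated = [tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[j]
                   for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    # Squared error of every point to its nearest centre.
    error = sum(min(math.dist(p, c) for c in centroids) ** 2 for p in points)
    return error, centroids

def best_of_restarts(points, k, restarts=10, seed=0):
    """Keep the run with the lowest squared error (hypothetical helper)."""
    rng = random.Random(seed)
    return min((run_kmeans(points, k, rng) for _ in range(restarts)), key=lambda r: r[0])

# Made-up data: three vertical pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
best_error, best_centroids = best_of_restarts(points, k=3)
```

Restarting does not guarantee the global minimum; it only makes a poor local minimum less likely to be the final answer.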
• J. B. MacQueen (1967): "Some Methods for Classification and Analysis of Multivariate Observations", Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, University of California Press, 1:281-297.
• Andrew Moore: "K-means and Hierarchical Clustering - Tutorial Slides".
• Brian T. Luke: "K-Means Clustering".
• Tariq Rashid: "Clustering".
• Hans-Joachim Mucha and Hizir Sofyan: "Nonhierarchical Clustering".
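This question was deleted on June 3. It was only shown again after I complained three times on Baidu Zhidao. I have already finished the translation myself. The points have been handed out and I cannot take them back, but rather than give them to people who will not translate even one sentence themselves and rely entirely on machine translation, I would close the question. Unfortunately I have not yet found a careful answer. There are no long or difficult sentences, and so far I have not seen anyone who put real effort in. There are still a few days left, so anyone who wants the points should hurry.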
Does this look all right?
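K-means (MacQueen, 1967) is one of the simplest unsupervised learning algorithms and solves the well-known clustering problem. The procedure classifies a given data set in a simple and easy way, using a number of clusters (say k clusters) fixed in advance. The main idea is to define k centroids, one per cluster. These centroids should be placed with care, because different positions produce different results. The better choice is therefore to put them as far from one another as possible. The next step is to take each point of the given data set and associate it with the nearest centroid. When no point is left waiting, the first step is complete and an early grouping has been made. At this point we need to recompute k new centroids as the barycenters of the clusters produced by the previous step. Once we have these k new centroids, a new binding must be made between the same data points and the nearest new centroid. A loop has been generated. As a result of this loop, we may notice that the k centroids change position step by step until no further changes occur. In other words, the centroids no longer move.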
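Finally, the algorithm aims to minimize an objective function, in this case a squared error function. The objective function

$$J = \sum_{j=1}^{k} \sum_{i=1}^{n} \left\| x_i^{(j)} - c_j \right\|^2,$$

where $\left\| x_i^{(j)} - c_j \right\|^2$ is a chosen distance measure between a data point $x_i^{(j)}$ and the cluster centre $c_j$, is an indicator of the distance of the n data points from their respective cluster centres.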
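The algorithm consists of the following steps:
1. Place K points into the space represented by the objects being clustered. These points represent the initial group centroids.
2. Assign each object to the group with the closest centroid.
3. When all objects have been assigned, recalculate the positions of the K centroids.
4. Repeat Steps 2 and 3 until the centroids no longer move. This produces a separation of the objects into groups from which the metric to be minimized can be calculated.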
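Although it can be proved that the procedure always terminates, the k-means algorithm does not necessarily find the optimal configuration, the one corresponding to the global minimum of the objective function. The algorithm is also markedly sensitive to the randomly selected initial cluster centres. The k-means algorithm can be run several times to reduce this effect.
K-means is a simple algorithm that has been adapted to many problem domains. As we will see, it is a good candidate for extension to work with fuzzy feature vectors.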
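• J. B. MacQueen (1967): "Some Methods for Classification and Analysis of Multivariate Observations", Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, University of California Press, 1:281-297.
• Andrew Moore: "K-means and Hierarchical Clustering - Tutorial Slides".
• Brian T. Luke: "K-Means Clustering".
• Tariq Rashid: "Clustering".
• Hans-Joachim Mucha and Hizir Sofyan: "Nonhierarchical Clustering".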