JCIS 2011, Vol. 7 (7): 2577-2584
Title: A Simple and Accurate Approach to Hierarchical Clustering
Authors: Jianfu LI, Jianshuang LI, Huaiqing HE
Abstract: The most important step in hierarchical clustering is choosing, quickly and accurately, the pair of clusters with the highest degree of similarity to merge. Neighbor joining, an evolutionary tree reconstruction algorithm widely used in computational biology, defines a similarity metric based on the Q-criterion. A great deal of empirical testing and theoretical study has shown that the Q-criterion is linear in distances, permutation equivariant, and consistent. Motivated by Neighbor joining, this paper proposes a Q-criterion based hierarchical clustering algorithm, named HACNJ. The main contribution of HACNJ is to introduce the Q-criterion to clustering for the first time. In theory, by using the Q-criterion, HACNJ is more accurate than basic hierarchical clustering. Moreover, HACNJ has the same complexity as basic hierarchical clustering; that is, the improvement in accuracy does not lead to an increase in complexity. An experiment on the Iris dataset verifies that HACNJ is effective.
Keywords: Hierarchical Clustering; Agglomerative Clustering; Similarity Metrics; Neighbor Joining; Q-Criterion
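
Illustrative sketch (not taken from the paper): the Q-criterion mentioned in the abstract is, in standard Neighbor joining, Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k), and the pair with the smallest Q is merged. The Python below shows only this pair-selection step on a small hand-made distance matrix. The function names `q_matrix` and `best_pair` and the example matrix `D` are assumptions made for illustration; how HACNJ updates cluster distances after a merge is not stated in the abstract and is not shown here.

```python
def q_matrix(dist):
    """Neighbor-joining Q values for a symmetric distance matrix (list of lists)."""
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    # Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k); diagonal left at 0.
    return [
        [(n - 2) * dist[i][j] - row_sums[i] - row_sums[j] if i != j else 0
         for j in range(n)]
        for i in range(n)
    ]


def best_pair(dist):
    """Return the index pair (i, j), i < j, with the smallest Q value."""
    q = q_matrix(dist)
    n = len(dist)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return min(pairs, key=lambda p: q[p[0]][p[1]])


# Hypothetical 5-item distance matrix, used only to exercise the functions.
D = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

pair = best_pair(D)
print(pair, q_matrix(D)[pair[0]][pair[1]])
```

In this example the smallest raw distance is between items 3 and 4 (distance 3), which is the pair a purely distance-based merge rule would pick, whereas the Q-criterion selects items 0 and 1 because it also accounts for how far each item is from all the others. Computing the full Q matrix this way costs O(n^2) per merge step.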