clustering.core.k-means

K-Means clustering generates a specific number of disjoint, flat
(non-hierarchical) clusters. It is well suited to generating globular
clusters. The K-Means method is numerical, unsupervised, non-deterministic
and iterative.

 * There are always K clusters.
 * There is always at least one item in each cluster.
 * The clusters are non-hierarchical and they do not overlap.
 * Every member of a cluster is closer to its own cluster's center than to
   the center of any other cluster.

The dataset is partitioned into K clusters and the data points are randomly
assigned to the clusters, resulting in clusters that have roughly the same
number of data points. Then, for each data point, calculate the distance
from the data point to each cluster.

If the data point is closest to its own cluster, leave it where it is. If the
data point is not closest to its own cluster, move it into the closest
cluster.

Repeat the above step until a complete pass through all the data points
results in no data point moving from one cluster to another. At this point
the clusters are stable and the clustering process ends. The choice of
initial partition can greatly affect the final clusters that result, in
terms of inter-cluster and intra-cluster distances and cohesion.
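The loop described above can be sketched directly in Clojure. `mean-point`
(coordinate-wise mean) and `dist` (Euclidean distance) are illustrative
helpers, not part of this namespace, and the sketch assumes no cluster
empties out during iteration:

```clojure
;; Illustrative helpers, not from the library.
(defn- mean-point [pts]
  (mapv #(/ (reduce + %) (double (count pts))) (apply map vector pts)))

(defn- dist [a b]
  (Math/sqrt (reduce + (map (fn [x y] (* (- x y) (- x y))) a b))))

;; Starting from an initial partition, repeatedly move every point to
;; the cluster whose mean is nearest, until a full pass moves nothing.
(defn k-means-sketch [initial-clusters]
  (loop [clusters initial-clusters]
    (let [means   (mapv mean-point clusters)
          nearest (fn [p] (apply min-key #(dist p (means %))
                                 (range (count means))))
          moved   (group-by nearest (apply concat clusters))
          next-cs (mapv #(vec (moved %)) (range (count means)))]
      (if (= (map set clusters) (map set next-cs))
        next-cs
        (recur next-cs)))))

(k-means-sketch [[[0 0] [10 10]] [[0 1] [10 11]]])
;; => [[[0 0] [0 1]] [[10 10] [10 11]]]
```

Note how the deliberately bad initial partition (one near point and one far
point in each cluster) is repaired after a single pass.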

centroids

(centroids distance-fn average-fn dataset means eta)
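The signature alone does not show the control flow, so here is a hedged
sketch of how `centroids` plausibly iterates, assuming `distance-fn`
compares two points and `average-fn` collapses a cluster's points into a
new mean (`euclidean` and `average` below are illustrative, not part of
the library):

```clojure
;; Illustrative helpers, not from the library.
(defn euclidean [a b]
  (Math/sqrt (reduce + (map (fn [x y] (* (- x y) (- x y))) a b))))

(defn average [pts]
  (mapv #(/ (reduce + %) (double (count pts))) (apply map vector pts)))

;; A sketch of centroids: reassign every point to its nearest mean,
;; recompute the means, and recur until no mean moves by eta or more.
(defn centroids [distance-fn average-fn dataset means eta]
  (let [closest   (fn [p] (apply min-key #(distance-fn p %) means))
        grouped   (group-by closest dataset)
        new-means (mapv #(if-let [pts (grouped %)] (average-fn pts) %) means)]
    (if (every? (fn [[o n]] (< (distance-fn o n) eta))
                (map vector means new-means))
      new-means
      (recur distance-fn average-fn dataset new-means eta))))

(centroids euclidean average [[0 0] [0 1] [10 10] [10 11]] [[0 0] [10 10]] 1e-6)
;; => [[0.0 0.5] [10.0 10.5]]
```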

classify

(classify distance-fn dataset means)
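A hedged sketch of what `classify` plausibly computes: partition the
dataset by nearest mean, yielding a map from each mean to the points
assigned to it (`euclidean` is an illustrative helper, not part of the
library):

```clojure
;; Illustrative Euclidean distance, not from the library.
(defn euclidean [a b]
  (Math/sqrt (reduce + (map (fn [x y] (* (- x y) (- x y))) a b))))

;; A sketch of classify: group each data point under the mean it is
;; closest to, according to the supplied distance function.
(defn classify [distance-fn dataset means]
  (group-by (fn [p] (apply min-key #(distance-fn p %) means)) dataset))

(classify euclidean [[0 0] [1 0] [9 9]] [[0 0] [10 10]])
;; => {[0 0] [[0 0] [1 0]], [10 10] [[9 9]]}
```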

cluster

(cluster distance-fn average-fn dataset means eta)
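A hedged sketch of `cluster` as a wrapper over the iteration: run the
means to convergence, then partition the dataset by nearest final mean.
The helpers are illustrative and the library's actual return shape may
differ (here it is a map from final mean to its member points):

```clojure
;; Illustrative helpers, not from the library.
(defn euclidean [a b]
  (Math/sqrt (reduce + (map (fn [x y] (* (- x y) (- x y))) a b))))

(defn average [pts]
  (mapv #(/ (reduce + %) (double (count pts))) (apply map vector pts)))

(defn centroids [distance-fn average-fn dataset means eta]
  (let [closest   (fn [p] (apply min-key #(distance-fn p %) means))
        grouped   (group-by closest dataset)
        new-means (mapv #(if-let [pts (grouped %)] (average-fn pts) %) means)]
    (if (every? (fn [[o n]] (< (distance-fn o n) eta))
                (map vector means new-means))
      new-means
      (recur distance-fn average-fn dataset new-means eta))))

;; A sketch of cluster: converge the means, then return the dataset
;; grouped by its nearest final mean.
(defn cluster [distance-fn average-fn dataset means eta]
  (let [final (centroids distance-fn average-fn dataset means eta)]
    (group-by (fn [p] (apply min-key #(distance-fn p %) final)) dataset)))

(cluster euclidean average [[0 0] [0 1] [10 10] [10 11]] [[0 0] [10 10]] 1e-6)
;; => {[0.0 0.5] [[0 0] [0 1]], [10.0 10.5] [[10 10] [10 11]]}
```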

converged?

(converged? distance-fn eta old-means new-means)
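A hedged sketch of the convergence test: the iteration is judged stable
when every mean has moved less than `eta` between successive rounds
(`euclidean` is an illustrative helper, not part of the library):

```clojure
;; Illustrative Euclidean distance, not from the library.
(defn euclidean [a b]
  (Math/sqrt (reduce + (map (fn [x y] (* (- x y) (- x y))) a b))))

;; A sketch of converged?: pair up old and new means and check that
;; each has drifted by less than the eta threshold.
(defn converged? [distance-fn eta old-means new-means]
  (every? (fn [[o n]] (< (distance-fn o n) eta))
          (map vector old-means new-means)))

(converged? euclidean 0.01 [[0 0] [5 5]] [[0 0.001] [5 5]])  ;; => true
(converged? euclidean 0.01 [[0 0] [5 5]] [[0 1] [5 5]])      ;; => false
```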

find-closest

(find-closest distance-fn point means)
Determine which mean is closest to the specified point, according to the
supplied distance function (this should take two points and return a
scalar distance between them).
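The docstring above can be sketched in one line with `min-key`;
`euclidean` is an illustrative distance function of the required shape,
not part of the library:

```clojure
;; Illustrative Euclidean distance: takes two points, returns a scalar.
(defn euclidean [a b]
  (Math/sqrt (reduce + (map (fn [x y] (* (- x y) (- x y))) a b))))

;; A sketch of find-closest: pick the mean that minimises the supplied
;; distance function against the given point.
(defn find-closest [distance-fn point means]
  (apply min-key (partial distance-fn point) means))

(find-closest euclidean [2 2] [[0 0] [3 3] [10 10]])
;; => [3 3]
```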

init-means

(init-means k dataset)
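A hedged sketch of one common seeding strategy for `init-means`: draw k
distinct points at random from the dataset as the starting means. Random
seeding is what makes the method non-deterministic; the library's actual
strategy may differ:

```clojure
;; A sketch of init-means: take k points from a shuffled copy of the
;; dataset to serve as the initial means.
(defn init-means [k dataset]
  (vec (take k (shuffle (vec dataset)))))

(init-means 2 [[0 0] [1 1] [2 2] [3 3]])
;; e.g. => [[2 2] [0 0]] (varies from run to run)
```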

update

(update average-fn classified old-means)
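A hedged sketch of the mean-update step, renamed `update-means` here so
the example does not shadow `clojure.core/update`. It assumes
`classified` is a map from each old mean to the points assigned to it
(as `classify` would produce); `average` is an illustrative helper, not
part of the library:

```clojure
;; Illustrative coordinate-wise mean, not from the library.
(defn average [pts]
  (mapv #(/ (reduce + %) (double (count pts))) (apply map vector pts)))

;; A sketch of update: replace each old mean with the average of the
;; points classified to it, keeping the old mean when no points were
;; assigned (so there is always at least one item per cluster slot).
(defn update-means [average-fn classified old-means]
  (mapv (fn [m] (if-let [pts (get classified m)] (average-fn pts) m))
        old-means))

(update-means average {[0 0] [[0 0] [0 2]]} [[0 0] [5 5]])
;; => [[0.0 1.0] [5 5]]
```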