4.4 Overall performance
The contingency tables of the clustering results with three clusters are depicted in Table 5. Part A of the table depicts the solution obtained with theoretical features, while Part B represents the solution obtained with POS features. Rows are gold standard classes and columns are clusters, labeled with the cluster number provided by the algorithm. The ordering of the cluster numbers corresponds to the quality of the cluster, measured in terms of the clustering criterion (see Equation (2)), 0 representing the cluster with the highest quality. In each cell Cij of Table 5, the number of adjectives of class i that are assigned to cluster j by the algorithm is given. The largest value for each class is highlighted (see gray cells).
First model: Three-way solution contingency tables for theoretical and POS features. Rows are gold standard classes, columns are clusters. Row TotalGS shows the number of Gold Standard lemmata and row Totalcl the total number of lemmata contained in each cluster. Note that the column labeled Total represents the row sum for each part (as the number of items per class is identical).
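The layout of such a contingency table can be illustrated with a minimal sketch. It assumes that each gold standard adjective comes with a class label and a cluster number; the function name `contingency_table` and the toy data are hypothetical and do not reproduce the counts in Table 5.

```python
from collections import Counter

def contingency_table(gold, clusters):
    """Build rows = gold classes, columns = clusters.

    Cell (i, j) holds the number of items of class i assigned to cluster j.
    """
    counts = Counter(zip(gold, clusters))
    classes = sorted(set(gold))
    cluster_ids = sorted(set(clusters))
    table = {c: [counts[(c, k)] for k in cluster_ids] for c in classes}
    return classes, cluster_ids, table

# Hypothetical toy data (NOT the data behind Table 5):
# one gold class label and one cluster number per adjective.
gold = ["Q", "Q", "Q", "R", "R", "QR", "I"]
clusters = [2, 2, 1, 0, 0, 0, 2]

classes, cluster_ids, table = contingency_table(gold, clusters)

# Row sums: number of gold standard items per class (the "Total" column).
row_totals = {c: sum(row) for c, row in table.items()}

# Column sums over gold standard items only (analogous to row TotalGS).
total_gs = [sum(table[c][j] for c in classes) for j in range(len(cluster_ids))]

for c in classes:
    print(c, table[c], row_totals[c])
print("TotalGS", total_gs)
```

Note that the sketch only covers gold standard items; a row such as Totalcl would additionally require the cluster assignments of all clustered lemmata.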
There is one cluster (cluster 0 in both solutions) that contains the majority of relational adjectives in the gold standard. This is the most compact cluster with respect to the clustering criterion.
The discussion focuses on the cluster analyses with three and five clusters, because the base is three classes (intensional, qualitative, and relational) and we assume a total of five classes (the basic classes plus the polysemous classes: intensional-qualitative and qualitative-relational).
Another cluster (2 in solution A, 1 in solution B) contains the majority of qualitative adjectives in the gold standard, as well as most of the intensional and IQ adjectives.
Adjectives that are polysemous between a qualitative and a relational reading (QR) are scattered across the clusters, although they show a tendency to be ascribed to the relational cluster in solution B (cluster 0).
The five-way results are depicted in Table 6. On the one hand, the table shows that the five-way structure found by the clustering algorithm is very similar to the three-way structure in Table 5: the three clusters in A and B have basically been replicated by the first three clusters in C and D, respectively. On the other hand, the differences between the structures obtained using theoretical versus POS features become visible in the five-way solutions. In the set-up of the experiment, we had expected one cluster per class, with QR and IQ adjectives isolated in a cluster of their own. This is clearly not borne out in Table 6. What we find instead is that (a) the mixed clusters persist and rank high in the clustering criterion (see cluster 0 in solution C and clusters 0–1 in solution D, with a mix of Q, QR, and R adjectives), and (b) two additional small clusters are created (clusters 3 and 4 in both solutions) with no clear interpretation, suggesting that the three-way set-up better matches the structure uncovered by the clustering algorithm.
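The expectation of one cluster per class can be checked mechanically against a contingency table. The following is a minimal sketch under stated assumptions: the helper names and the counts are hypothetical (they are not the values of Table 6), and taking the largest cell per row as a class's "own" cluster is a simplification introduced here, not a procedure described in the text.

```python
def majority_cluster(table):
    """Map each gold class to the column index of its largest cell."""
    return {cls: max(range(len(row)), key=row.__getitem__)
            for cls, row in table.items()}

def classes_sharing_cluster(table):
    """Return clusters that are the majority cluster of more than one class."""
    by_cluster = {}
    for cls, j in majority_cluster(table).items():
        by_cluster.setdefault(j, []).append(cls)
    return {j: sorted(members)
            for j, members in by_cluster.items() if len(members) > 1}

# Hypothetical five-way counts (rows = gold classes, columns = clusters 0-4).
toy = {
    "I":  [0, 1, 3, 0, 0],
    "IQ": [0, 1, 2, 0, 1],
    "Q":  [2, 9, 4, 1, 0],
    "QR": [5, 3, 2, 0, 1],
    "R":  [8, 1, 0, 1, 0],
}

shared = classes_sharing_cluster(toy)
print(shared)
```

In this toy table no class has cluster 3 or 4 as its majority cluster, and the polysemous classes share a majority cluster with a basic class, which is the kind of pattern the paragraph above describes.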
From the discussion of Tables 5 and 6 we conclude that the three-way clustering matches the target classification better than the five-way clustering, and that polysemous adjectives are not identified as a separate class. These results suggest that modeling polysemous adjectives in terms of additional, complex classes is not an adequate approach (we return to this point later).
Recall that we defined theoretical and POS features in order to compare the structures obtained using theoretically informed and theory-independent features. Further feature analysis, not reported here for space reasons, shows a high correlation between the most descriptive features of solutions A and B.³ This highlights the correspondence between the two feature representations with respect to the clustering results: the POS features selected as most discriminative by the clustering algorithm are precisely those that correspond to the theoretical features. This correspondence explains the similarity between the solutions obtained with the two types of representation and at the same time provides support for the present definition of the theoretical features.