In the same separate post that you are alluding to, it was also discussed that you 
should upgrade to 0.9, which fixes that issue. Running seqdumper on the 
clustered output should give the weight of each vector and its distance from the 
cluster centroid.

Did you try running seqdumper on the clustered output?
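
For reference, the distance mentioned here is, as I understand it, the value of the configured distance measure between a vector and the centroid of the cluster it was assigned to. A minimal sketch, assuming a Euclidean measure (the measure actually applied depends on how the k-means job was configured); the class and method names are hypothetical and not part of Mahout:

```java
// Hypothetical illustration only; this is not Mahout code.
final class CentroidDistanceSketch {

  /** Euclidean distance between a point and a centroid of the same dimension. */
  static double euclidean(double[] point, double[] centroid) {
    if (point.length != centroid.length) {
      throw new IllegalArgumentException("dimension mismatch");
    }
    double sumOfSquares = 0.0;
    for (int i = 0; i < point.length; i++) {
      double diff = point[i] - centroid[i];
      sumOfSquares += diff * diff;
    }
    return Math.sqrt(sumOfSquares);
  }

  public static void main(String[] args) {
    double[] point = {0.0, 0.0};
    double[] centroid = {3.0, 4.0};
    System.out.println(euclidean(point, centroid));
  }
}
```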






On Sunday, February 23, 2014 10:32 PM, Bikash Gupta <bikash.gupt...@gmail.com> 
wrote:
 
Thanks, that makes sense.

In a separate post we discussed that "The Clustered output should display 
the vectors with the vectorid that belong to a specific cluster along with the 
distance of that vector from the cluster center."

So, based on that code, we are losing a few things for named vectors:

1. The weight of the vector, as it only prints the vector name.
2. The distance of that vector from the cluster center.

Would it be a good idea to modify that code?
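
To make the question concrete, here is a minimal sketch of what such a modified CSV line could look like, with the weight and distance written next to each vector name. It is a standalone illustration and not a patch to CSVClusterWriter: the `Point` class is a hypothetical stand-in for the clustered point type, the `name:weight:distance` layout is an arbitrary choice, and how the weight and distance are read from the real Mahout objects is left out and would need checking against the version in use.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in types; none of these are Mahout classes.
final class ClusterCsvSketch {

  /** One clustered point: an optional name, a fallback text form, weight and distance. */
  static final class Point {
    final String name;      // null when the vector is not a named vector
    final String formatted; // text used in place of the name when name is null
    final double weight;
    final double distance;  // distance from the cluster center

    Point(String name, String formatted, double weight, double distance) {
      this.name = name;
      this.formatted = formatted;
      this.weight = weight;
      this.distance = distance;
    }
  }

  /** Builds one CSV line: the cluster id, then label:weight:distance for each point. */
  static String formatLine(int clusterId, List<Point> points) {
    StringBuilder line = new StringBuilder();
    line.append(clusterId);
    for (Point p : points) {
      String label = (p.name != null) ? p.name : p.formatted;
      line.append(',').append(label)
          .append(':').append(p.weight)
          .append(':').append(p.distance);
    }
    return line.toString();
  }

  public static void main(String[] args) {
    List<Point> points = Arrays.asList(
        new Point("doc-1", null, 1.0, 0.25),
        new Point(null, "1.0_2.0", 1.0, 0.5));
    // Prints: 7,doc-1:1.0:0.25,1.0_2.0:1.0:0.5
    System.out.println(formatLine(7, points));
  }
}
```

One caveat with this layout: a vector name that itself contains ':' or ',' would make the line ambiguous, so a different separator or some escaping may be needed for real keys.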




On Mon, Feb 24, 2014 at 6:05 AM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:

The key in the CSV is the clusterId (and not the named vector).
>
>Here's the complete code snippet which should make sense.
>
>{Code}
>
>    Cluster cluster = clusterWritable.getValue();
>    line.append(cluster.getId());
>    List<WeightedPropertyVectorWritable> points =
>        getClusterIdToPoints().get(cluster.getId());
>    if (points != null) {
>      for (WeightedPropertyVectorWritable point : points) {
>        Vector theVec = point.getVector();
>        line.append(',');
>
>        if (theVec instanceof NamedVector) {
>          line.append(((NamedVector)theVec).getName());
>        } else {
>          String vecStr = theVec.asFormatString();
>          //do some basic manipulations for display
>          vecStr = VEC_PATTERN.matcher(vecStr).replaceAll("_");
>          line.append(vecStr);
>        }
>      }
>      getWriter().append(line).append("\n");
>    }
>
>
>{Code}
>
>For each clusterId it prints the names of the Named Vectors in the cluster or 
>the vector itself (if not a named vector).
>Hope that clarifies.
>
>
>
>
>
>
>
>
>
>On Friday, February 21, 2014 2:13 AM, Bikash Gupta <bikash.gupt...@gmail.com> 
>wrote:
> 
>Suneel,
>
>I was going through the code of CSVClusterWriter and found that if the vector
>is an instance of NamedVector then it writes only the key.
>
>        if (theVec instanceof NamedVector) {
>          line.append(((NamedVector)theVec).getName());
>        } else {
>          String vecStr = theVec.asFormatString();
>          //do some basic manipulations for display
>          vecStr = VEC_PATTERN.matcher(vecStr).replaceAll("_");
>          line.append(vecStr);
>        }
>
>Hence I am getting only the key in the output of the cluster dumper. Could you
>explain the design assumption behind this?
>
>On Wed, Feb 19, 2014 at 10:36 PM, Bikash Gupta <bikash.gupt...@gmail.com> 
>wrote:
>> I am running the cluster dumper.
>>
>> After extracting the output from the cluster dump I transpose the rows to
>> columns, hence I have called this class directly from my Java code.
>>
>> Code:
>>
>> ClusterDumper.main(new String[] {
>>                 buildOption(DefaultOptionCreator.INPUT_OPTION),seqFileDir,
>>                 buildOption(DefaultOptionCreator.OUTPUT_OPTION),outputFile,
>>                 buildOption(ClusterDumper.OUTPUT_FORMAT_OPT),format,
>>                 buildOption(ClusterDumper.POINTS_DIR_OPTION),pointsDir
>>                 });
>>
>> I have attached the output too. Please note that the key of the sequence file is
>> Text.class and it is separated using the "`" character. I have also attached
>> the cluster metadata.
>>
>>
>>
>>
>> On Wed, Feb 19, 2014 at 9:21 PM, Suneel Marthi <suneel_mar...@yahoo.com> 
>> wrote:
>>> Are you running clusterdump or seqdumper?
>>>
>>> Could you paste the commands that you ran and their respective outputs?
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Wednesday, February 19, 2014 6:16 AM, Bikash Gupta 
>>> <bikash.gupt...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> After running the cluster dumper on the Kmeans output I am getting only
>>> the key of the sequence file.
>>>
>>> The options provided to the cluster dumper are:
>>>
>>> -i <<cluster-*-final of Kmeans>> -o <<Output File>> -p
>>> <<clusteredPoint>> -of CSV
>>>
>>> Is there something that I am missing?
>>>
>>> Please note: I am using sequential mode.
>>>
>>> --
>>> Regards
>>> Bikash Gupta
>>
>>
>>
>> --
>> Thanks & Regards
>> Bikash Kumar Gupta
>
>
>
>-- 
>Thanks & Regards
>Bikash Kumar Gupta
>
>
>


-- 
Thanks & Regards
Bikash Kumar Gupta 
