[jira] [Commented] (CASSANDRA-3745) contrib/PIG example fails when column metadata exists for CF

Sasha Dolgy (Commented) (JIRA) Sat, 14 Jan 2012 09:02:03 -0800

    [ https://issues.apache.org/jira/browse/CASSANDRA-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186259#comment-13186259 ]


Sasha Dolgy commented on CASSANDRA-3745:
----------------------------------------

To recreate the CF (small one):  

create column family entity_relations with comparator=UTF8Type and 
column_metadata=[
    {column_name: owner_id, validation_class: UTF8Type, index_type: KEYS},
    {column_name: column_family, validation_class: UTF8Type, index_type: KEYS}
];

cassandra@ubuntu:~$ nodetool -h 127.0.0.1 version
ReleaseVersion: 1.0.6


                
> contrib/PIG example fails when column metadata exists for CF
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-3745
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3745
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Contrib
>    Affects Versions: 1.0.6
>            Reporter: Sasha Dolgy
>              Labels: cassandra, pig
>
> I have a sandbox CF for prototyping, and it has 17 secondary indexes defined.
> When I run the contrib/PIG example, using pig 0.8.1 and even the pig
> 0.8.3 jar, with Cassandra 1.0.6, I receive the following error from the
> second line of the example script [ cols = FOREACH rows GENERATE
> flatten(columns); ]:
>
> 2012-01-14 06:54:27,551 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1007: Found duplicates in schema. : 18 columns. Please alias the columns with unique names.
>
> I dropped all of the indexes and tried again, with the same error. On
> further inspection, show schema showed that the column metadata from the
> indexes still existed on the CF. I ran the following:
>
> update column family user with column_metadata = [];
>
> I can now run the full contrib/pig example against my CF.
>
> If I select another CF with 2 secondary indexes, the same behaviour persists:
>
> 2012-01-14 08:34:31,413 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1007: Found duplicates in schema. : 3 columns. Please alias the columns with unique names.
>
> grunt> describe users;
> 2012-01-14 08:36:58,227 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> users: {key: bytearray,columns: {T: (name: chararray,value: bytearray,column_family: chararray,value: bytearray,owner_id: chararray,value: bytearray)}}
> grunt>
> grunt> dump users;
> <-- removed INFO/WARN output -->
> HadoopVersion   PigVersion      UserId  StartedAt               FinishedAt              Features
> 0.20.2          0.8.1           sasha   2012-01-14 08:37:24     2012-01-14 08:37:43     UNKNOWN
> Success!
> Job Stats (time in seconds):
> JobId           Alias   Feature         Outputs
> job_local_0001  users   MAP_ONLY        file:/tmp/temp-1366421017/tmp-1001688304,
> Input(s):
> Successfully read records from: "cassandra://sdo/entity_relations"
> Output(s):
> Successfully stored records in: "file:/tmp/temp-1366421017/tmp-1001688304"
> Job DAG:
> job_local_0001
> (d1540edc-cb16-47dd-96e3-90e1657c2d77:a721966c6026ee85ef35f2108b75d3784b52bf1217f0b62564bdefe67b9504d9,{(content_id,d1540edc-cb16-47dd-96e3-90e1657c2d77:a721966c6026ee85ef35f2108b75d3784b52bf1217f0b62564bdefe67b9504d9),(owner_id,d1540edc-cb16-47dd-96e3-90e1657c2d77)})
> grunt>
> I have also tried this with PIG 0.9.1 but encounter 
> https://issues.apache.org/jira/browse/CASSANDRA-3371 
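
The counts in the two error messages appear to line up with how often the alias "value" occurs in the inner tuple schema: once for the generic (name, value) pair, plus once per column metadata entry. The sketch below only illustrates that reading. The alias list is copied from the describe users output quoted above; the helper names duplicate_aliases and aliases_for are hypothetical, and the "one extra (column, value) pair per metadata entry" model is an assumption inferred from this report, not taken from the CassandraStorage or Pig source.

```python
from collections import Counter

# Field aliases of the inner tuple T, copied from the `describe users` output.
tuple_aliases = ["name", "value", "column_family", "value", "owner_id", "value"]


def duplicate_aliases(aliases):
    """Return {alias: count} for every alias that occurs more than once."""
    counts = Counter(aliases)
    return {alias: n for alias, n in counts.items() if n > 1}


def aliases_for(metadata_columns):
    """Assumed model of the generated schema (hypothetical helper):
    a generic (name, value) pair, then one (column, value) pair
    for each column that has metadata defined on the CF."""
    aliases = ["name", "value"]
    for column in metadata_columns:
        aliases += [column, "value"]
    return aliases


# CF with 2 column metadata entries: "value" occurs 3 times.
print(duplicate_aliases(tuple_aliases))  # -> {'value': 3}

# CF with 17 column metadata entries (placeholder names): 18 occurrences.
seventeen = [f"col{i}" for i in range(17)]
print(duplicate_aliases(aliases_for(seventeen)))  # -> {'value': 18}
```

Under that model, 2 metadata entries give 3 duplicates and 17 give 18, which matches the "3 columns" and "18 columns" in the errors above. An empty column_metadata list leaves a single "value" alias, which would be consistent with the example running once the metadata was cleared.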

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
