[ https://issues.apache.org/jira/browse/CASSANDRA-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186259#comment-13186259 ]
Sasha Dolgy commented on CASSANDRA-3745:
----------------------------------------

To recreate the CF (small one):

create column family entity_relations
    with comparator=UTF8Type
    and column_metadata=[
    {column_name: owner_id, validation_class: UTF8Type, index_type: KEYS},
    {column_name: column_family, validation_class: UTF8Type, index_type: KEYS}
    ];

cassandra@ubuntu:~$ nodetool -h 127.0.0.1 version
ReleaseVersion: 1.0.6

> contrib/PIG example fails when column metadata exists for CF
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-3745
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3745
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Contrib
>    Affects Versions: 1.0.6
>            Reporter: Sasha Dolgy
>              Labels: cassandra, pig
>
> I have a sandbox CF for prototyping and it has 17 Secondary Indexes defined.
> When I would run the contrib/PIG example, using pig 0.8.1 and even the pig
> 0.8.3 jar, with Cassandra 1.0.6, I would receive the following error from the
> second line of the example script [ cols = FOREACH rows GENERATE
> flatten(columns); ]:
> 2012-01-14 06:54:27,551 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1007: Found duplicates in schema. : 18 columns. Please alias the columns with unique names.
> I proceeded to drop all of the indexes, and tried again. Same error. On
> further inspection, show schema showed that the metadata still existed on the
> CF from the indexes. I ran the following:
> update column family user with column_metadata = [];
> I can now run the full contrib/pig example against my CF.
> *If I select another CF with 2 secondary indexes, the same behaviour persists:
> 2012-01-14 08:34:31,413 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1007: Found duplicates in schema. : 3 columns. Please alias the columns with unique names.
> grunt> describe users;
> 2012-01-14 08:36:58,227 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> users: {key: bytearray,columns: {T: (name: chararray,value: bytearray,column_family: chararray,value: bytearray,owner_id: chararray,value: bytearray)}}
> grunt>
> grunt> dump users;
> <-- removed INFO/WARN output -->
> HadoopVersion PigVersion UserId StartedAt FinishedAt Features
> 0.20.2 0.8.1 sasha 2012-01-14 08:37:24 2012-01-14 08:37:43 UNKNOWN
> Success!
> Job Stats (time in seconds):
> JobId Alias Feature Outputs
> job_local_0001 users MAP_ONLY file:/tmp/temp-1366421017/tmp-1001688304,
> Input(s):
> Successfully read records from: "cassandra://sdo/entity_relations"
> Output(s):
> Successfully stored records in: "file:/tmp/temp-1366421017/tmp-1001688304"
> Job DAG:
> job_local_0001
> (d1540edc-cb16-47dd-96e3-90e1657c2d77:a721966c6026ee85ef35f2108b75d3784b52bf1217f0b62564bdefe67b9504d9,{(content_id,d1540edc-cb16-47dd-96e3-90e1657c2d77:a721966c6026ee85ef35f2108b75d3784b52bf1217f0b62564bdefe67b9504d9),(owner_id,d1540edc-cb16-47dd-96e3-90e1657c2d77)})
> grunt>
> I have also tried this with PIG 0.9.1 but encounter
> https://issues.apache.org/jira/browse/CASSANDRA-3371
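
As a side illustration (not part of the original report), the tuple schema printed by `describe users;` can be checked for repeated field aliases. The sketch below is plain Python and is not Pig's own validation logic; `SCHEMA` is copied from the report, and `duplicate_aliases` is a hypothetical helper name chosen for this example.

```python
from collections import Counter

# Tuple schema copied from the `describe users;` output in the report.
SCHEMA = (
    "name: chararray,value: bytearray,column_family: chararray,"
    "value: bytearray,owner_id: chararray,value: bytearray"
)


def duplicate_aliases(schema):
    """Return {alias: count} for every alias that occurs more than once.

    Assumes a flat, comma-separated list of `alias: type` fields,
    as in the schema string above (no nested tuples or bags).
    """
    aliases = [field.split(":")[0].strip() for field in schema.split(",")]
    counts = Counter(aliases)
    return {alias: n for alias, n in counts.items() if n > 1}


if __name__ == "__main__":
    print(duplicate_aliases(SCHEMA))
```

For this schema the only repeated alias is `value`, which occurs three times. That appears to line up with the "3 columns" in the second ERROR 1007 message, though the report does not confirm how Pig arrives at that count.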