[ 
https://issues.apache.org/jira/browse/GORA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345902#comment-14345902
 ] 

Lewis John McGibbney commented on GORA-416:
-------------------------------------------

OK, so I've debugged this right through the 
[FetcherJob|https://github.com/apache/nutch/blob/2.x/src/java/org/apache/nutch/fetcher/FetcherJob.java]
 task.
What happens is that we iterate through the UNION structure of the 
[protocolStatus field|https://github.com/apache/nutch/blob/2.x/src/gora/webpage.avsc#L58-L95] 
of the Nutch WebPage object. The field value at position 1 is created 
as **protocolStatus_UnionIndex**, and a 
[subColumn is created as we desire|https://github.com/apache/gora/blob/master/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L590].
 
However, once this has been done, when we come to the field value at position 1 
we recurse into 
[addOrUpdateField|https://github.com/apache/gora/blob/master/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L598],
 where we encounter the 
[RECORD which is the actual protocolStatus|https://github.com/apache/gora/blob/master/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L506].
 This one contains the actual value.
We then 
[add this as a normal column|https://github.com/apache/gora/blob/master/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraStore.java#L512]
 instead of the super column that it is defined as. This matches the two 
addOrUpdateField frames (CassandraStore.java:598 and CassandraStore.java:512) 
in the stack trace quoted below.
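
To make the sequence above easier to follow, here is a minimal, self-contained 
sketch of the control flow as I understand it. It is not the CassandraStore 
code: the class name, the Type enum, the writes list and the superFamilyAware 
flag are all hypothetical, and the writes are only recorded in memory rather 
than sent to Cassandra. With superFamilyAware set to false the RECORD branch 
is written as a normal column, which is the behaviour described above; setting 
it to true shows one possible direction for a fix, not a committed one.

```java
import java.util.ArrayList;
import java.util.List;

public class UnionWriteSketch {
    // Simplified stand-ins for the Avro schema types involved (hypothetical).
    enum Type { UNION, RECORD, PRIMITIVE }

    // Records each write as "column:<name>" or "subColumn:<name>"
    // instead of talking to Cassandra.
    static final List<String> writes = new ArrayList<>();

    static void addColumn(String name) { writes.add("column:" + name); }

    static void addSubColumn(String name) { writes.add("subColumn:" + name); }

    /**
     * Hypothetical model of the recursive write.
     *
     * @param branchType       type of the selected union branch (used only for UNION)
     * @param inSuperFamily    whether the field is mapped to a super column family
     * @param superFamilyAware false models the reported behaviour; true models a
     *                         RECORD branch that honours the super column mapping
     */
    static void addOrUpdateField(String field, Type type, Type branchType,
                                 boolean inSuperFamily, boolean superFamilyAware) {
        switch (type) {
            case UNION:
                // The union index is stored as a sub column, as desired.
                addSubColumn(field + "_UnionIndex");
                // Recurse on the branch that holds the actual value.
                addOrUpdateField(field, branchType, null, inSuperFamily, superFamilyAware);
                break;
            case RECORD:
                if (superFamilyAware && inSuperFamily) {
                    addSubColumn(field);
                } else {
                    // Normal column written into a super column family.
                    addColumn(field);
                }
                break;
            default:
                addColumn(field);
        }
    }

    public static void main(String[] args) {
        addOrUpdateField("protocolStatus", Type.UNION, Type.RECORD, true, false);
        System.out.println(writes); // [subColumn:protocolStatus_UnionIndex, column:protocolStatus]

        writes.clear();
        addOrUpdateField("protocolStatus", Type.UNION, Type.RECORD, true, true);
        System.out.println(writes); // [subColumn:protocolStatus_UnionIndex, subColumn:protocolStatus]
    }
}
```

The first run ends with a plain column write for protocolStatus, which is the 
kind of insert Cassandra rejects for a super column family.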

> Error when populating data into Cassandra super column - 
> InvalidRequestException(why:supercolumn parameter is not optional for super 
> CF sc
> ------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GORA-416
>                 URL: https://issues.apache.org/jira/browse/GORA-416
>             Project: Apache Gora
>          Issue Type: Bug
>          Components: gora-cassandra
>    Affects Versions: 0.6
>         Environment: Nutch 2.4-SNAPSHOT, Gora 0.6.1-SNAPSHOT, Hadoop 2.5.2, 
> Cassandra 2.0.7
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Blocker
>             Fix For: 0.6.1
>
>
> Error when populating data into Cassandra super column.
> {code}
> lmcgibbn@LMC-032857 /usr/local/2webgui/runtime/local(master) $ ./bin/nutch 
> fetch 1425410774-370456822
> FetcherJob: starting at 2015-03-03 11:27:57
> FetcherJob: batchId: 1425410774-370456822
> FetcherJob: threads: 10
> FetcherJob: parsing: false
> FetcherJob: resuming: false
> FetcherJob : timelimit set for : -1
> 2015-03-03 11:27:58.101 java[3267:1903] Unable to load realm info from 
> SCDynamicStore
> Using queue mode : byHost
> Fetcher: threads: 10
> QueueFeeder finished: total 1 records. Hit by time limit :0
> fetching http://nutch.apache.org/ (queue crawl delay=5000ms)
> -finishing thread FetcherThread1, activeThreads=1
> -finishing thread FetcherThread2, activeThreads=1
> -finishing thread FetcherThread3, activeThreads=1
> -finishing thread FetcherThread4, activeThreads=1
> -finishing thread FetcherThread5, activeThreads=1
> -finishing thread FetcherThread6, activeThreads=1
> -finishing thread FetcherThread7, activeThreads=1
> -finishing thread FetcherThread8, activeThreads=1
> Fetcher: throughput threshold: -1
> -finishing thread FetcherThread9, activeThreads=1
> Fetcher: throughput threshold sequence: 5
> -finishing thread FetcherThread0, activeThreads=0
> 0/0 spinwaiting/active, 1 pages, 0 errors, 0.2 0 pages/s, 82 82 kb/s, 0 URLs 
> in 0 queues
> -activeThreads=0
> me.prettyprint.hector.api.exceptions.HInvalidRequestException: 
> InvalidRequestException(why:supercolumn parameter is not optional for super 
> CF sc)
>       at 
> me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:52)
>       at 
> me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:260)
>       at 
> me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:113)
>       at 
> me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
>       at 
> me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:69)
>       at 
> org.apache.gora.cassandra.store.HectorUtils.insertColumn(HectorUtils.java:46)
>       at 
> org.apache.gora.cassandra.store.CassandraClient.addColumn(CassandraClient.java:293)
>       at 
> org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:512)
>       at 
> org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:598)
>       at 
> org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:316)
>       at 
> org.apache.gora.cassandra.store.CassandraStore.close(CassandraStore.java:160)
>       at 
> org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:56)
>       at 
> org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(ReduceTask.java:550)
>       at 
> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:629)
>       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>       at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: InvalidRequestException(why:supercolumn parameter is not optional 
> for super CF sc)
>       at 
> org.apache.cassandra.thrift.Cassandra$batch_mutate_result$batch_mutate_resultStandardScheme.read(Cassandra.java:28082)
>       at 
> org.apache.cassandra.thrift.Cassandra$batch_mutate_result$batch_mutate_resultStandardScheme.read(Cassandra.java:28068)
>       at 
> org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:28002)
>       at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>       at 
> org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:1060)
>       at 
> org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:1046)
>       at 
> me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:246)
>       at 
> me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:243)
>       at 
> me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:104)
>       at 
> me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:253)
>       ... 19 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
