[ 
https://issues.apache.org/jira/browse/CASSANDRA-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13189205#comment-13189205
 ] 

Brandon Williams commented on CASSANDRA-3740:
---------------------------------------------

bq. 1) "org.apache.cassandra.config.Config" does not initialize all the 
properties, which results in a null pointer exception in a static block 
of class "DatabaseDescriptor", for example "conf.commitlog_sync"

The attached patches address that.
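
For anyone unfamiliar with the failure mode described in 1), here is a minimal sketch of it. The class and enum names (StaticInitSketch, SketchConfig, SketchDescriptor, SyncMode) are hypothetical stand-ins, not Cassandra's actual code; only the field name commitlog_sync is taken from the report. It assumes nothing more than a settings object whose field is never populated and a static block that dereferences it.

```java
// Hypothetical stand-ins for illustration; these are NOT Cassandra's real classes.
class StaticInitSketch {
    enum SyncMode { periodic, batch }

    /** Plain settings holder: a field stays null unless something populates it. */
    static class SketchConfig {
        SyncMode commitlog_sync; // never assigned in this sketch, so it is null
    }

    /** Reads the settings in a static initializer, as a descriptor-style class might. */
    static class SketchDescriptor {
        static final SketchConfig conf = new SketchConfig();
        static final boolean batchMode;

        static {
            // Dereferencing the null field throws NullPointerException here,
            // which the JVM wraps in ExceptionInInitializerError.
            batchMode = conf.commitlog_sync.equals(SyncMode.batch);
        }
    }

    /** Triggers class initialization and reports what went wrong, if anything. */
    static String probe() {
        try {
            return "initialized, batchMode=" + SketchDescriptor.batchMode;
        } catch (ExceptionInInitializerError e) {
            // First access: the static block failed.
            return "failed: " + e.getCause().getClass().getSimpleName();
        } catch (NoClassDefFoundError e) {
            // Later accesses: the JVM has already marked the class unusable.
            return "failed: class could not be initialized";
        }
    }

    public static void main(String[] args) {
        System.out.println(probe());
    }
}
```

The point of the sketch is that the failure surfaces at class-load time, before any job code gets a chance to supply the missing value.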

bq. 2) I can't see any method in ConfigHelper to specify that I am using 
"Supercolumn"

"mapreduce.output.bulkoutputformat.issuper" controls that.

bq. 3) Also there is no method to specify comparator and subcomparator in 
ConfigHelper and it seems like comparators have been hard coded to "BytesType"

BytesType will sort correctly; the comparators are in the schema on the remote 
nodes.

bq. Apart from these issues, I do not think that we are considering the TTL case 
in BulkOutputFormat.

CASSANDRA-3754 will handle this.
                
> BulkOutputFormat unnecessarily looks for the cassandra.yaml file.
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-3740
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3740
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 1.1
>            Reporter: Samarth Gahire
>            Assignee: Brandon Williams
>              Labels: cassandra, hadoop, mapreduce
>             Fix For: 1.1
>
>         Attachments: 0001-Make-DD-the-canonical-partitioner-source.txt, 
> 0002-Prevent-loading-from-yaml.txt, 0003-use-output-partitioner.txt, 
> 0004-update-BOF-for-new-dir-layout.txt
>
>
> I am trying to use BulkOutputFormat to stream data from the map tasks of a 
> Hadoop job. I have set the Cassandra-related configuration using ConfigHelper. 
> I have also looked into the Cassandra code, and it seems Cassandra takes care 
> not to look for the cassandra.yaml file.
> But when I run the job I still get the following error:
> {
> 12/01/13 11:30:04 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 12/01/13 11:30:04 INFO input.FileInputFormat: Total input paths to process : 1
> 12/01/13 11:30:04 INFO mapred.JobClient: Running job: job_201201130910_0015
> 12/01/13 11:30:05 INFO mapred.JobClient:  map 0% reduce 0%
> 12/01/13 11:30:23 INFO mapred.JobClient: Task Id : attempt_201201130910_0015_m_000000_0, Status : FAILED
> java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
> Caused by: java.io.IOException: Task process exit with nonzero status of 1.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
> attempt_201201130910_0015_m_000000_0: Cannot locate cassandra.yaml
> attempt_201201130910_0015_m_000000_0: Fatal configuration error; unable to start server.
> }
> Also, please let me know how I can make the cassandra.yaml file available to 
> a Hadoop MapReduce job.
