[jira] [Commented] (CASSANDRA-3740) While using BulkOutputFormat unneccessarily look for the cassandra.yaml file.

Samarth Gahire (Commented) (JIRA) Wed, 01 Feb 2012 23:40:35 -0800

    [ https://issues.apache.org/jira/browse/CASSANDRA-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198577#comment-13198577 ]


Samarth Gahire commented on CASSANDRA-3740:
-------------------------------------------

Cool! It's working perfectly with the updated patches.
Can you please explain:
1) What is the significance of "INPUT_INITIAL_THRIFT_ADDRESS" for 
BulkOutputFormat?
2) What am I supposed to provide there (if it is needed)?
3) Is there any need to provide the listen address of the Hadoop nodes for 
BulkOutputFormat? If yes, how do I provide it?

Actually, we are experiencing a problem while loading the data: it fails 
to connect if the host the M/R job is running on is dual-stack, i.e. has both 
IPv4 and IPv6 addresses. 
Also, it works when cassandra.yaml is provided; maybe it is reading the listen 
address or something else from cassandra.yaml.
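
To help narrow down the dual-stack case described above, here is a minimal diagnostic sketch that lists the addresses a host name resolves to and reports whether both IPv4 and IPv6 are present. It uses only the standard java.net API. The class name DualStackCheck and its helper methods are hypothetical names chosen for illustration; nothing here is part of Cassandra, Hadoop, or the attached patches.

```java
import java.net.Inet4Address;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper class for illustration only; not part of Cassandra or Hadoop.
public class DualStackCheck {

    /** Parses an IP literal such as "127.0.0.1" or "::1" (intended for literals only). */
    static InetAddress parseLiteral(String ip) {
        try {
            return InetAddress.getByName(ip);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a usable IP literal: " + ip, e);
        }
    }

    /** True if at least one of the addresses is an IPv4 address. */
    static boolean hasIPv4(InetAddress[] addrs) {
        for (InetAddress a : addrs) {
            if (a instanceof Inet4Address) return true;
        }
        return false;
    }

    /** True if at least one of the addresses is an IPv6 address. */
    static boolean hasIPv6(InetAddress[] addrs) {
        for (InetAddress a : addrs) {
            if (a instanceof Inet6Address) return true;
        }
        return false;
    }

    /** True if the addresses include both an IPv4 and an IPv6 entry. */
    static boolean isDualStack(InetAddress[] addrs) {
        return hasIPv4(addrs) && hasIPv6(addrs);
    }

    public static void main(String[] args) throws UnknownHostException {
        // Use the host given on the command line, or fall back to the local host name.
        String host = args.length > 0 ? args[0] : InetAddress.getLocalHost().getHostName();
        InetAddress[] addrs = InetAddress.getAllByName(host);
        for (InetAddress a : addrs) {
            String family = (a instanceof Inet6Address) ? "IPv6" : "IPv4";
            System.out.println(family + "  " + a.getHostAddress());
        }
        System.out.println("dual-stack: " + isDualStack(addrs));
        // Prints null when the property has not been set on this JVM.
        System.out.println("java.net.preferIPv4Stack = "
                + System.getProperty("java.net.preferIPv4Stack"));
    }
}
```

It can be run on each Hadoop node, optionally passing a host name as the first argument. If it reports a dual-stack host, one thing that may be worth trying (an assumption, not something confirmed in this issue) is starting the task JVMs with the standard Java property -Djava.net.preferIPv4Stack=true.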
                
> While using BulkOutputFormat  unneccessarily look for the cassandra.yaml file.
> ------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3740
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3740
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 1.1
>            Reporter: Samarth Gahire
>            Assignee: Brandon Williams
>              Labels: cassandra, hadoop, mapreduce
>             Fix For: 1.1
>
>         Attachments: 0001-Make-DD-the-canonical-partitioner-source.txt, 
> 0002-Prevent-loading-from-yaml.txt, 0003-use-output-partitioner.txt, 
> 0004-update-BOF-for-new-dir-layout.txt
>
>
> I am trying to use BulkOutputFormat to stream data from the map phase of a 
> Hadoop job. I have set the Cassandra-related configuration using ConfigHelper. 
> I have also looked into the Cassandra code, and it seems Cassandra takes care 
> not to look for the cassandra.yaml file.
> But when I run the job I still get the following error:
> {
> 12/01/13 11:30:04 WARN mapred.JobClient: Use GenericOptionsParser for parsing 
> the arguments. Applications should implement Tool for the same.
> 12/01/13 11:30:04 INFO input.FileInputFormat: Total input paths to process : 1
> 12/01/13 11:30:04 INFO mapred.JobClient: Running job: job_201201130910_0015
> 12/01/13 11:30:05 INFO mapred.JobClient:  map 0% reduce 0%
> 12/01/13 11:30:23 INFO mapred.JobClient: Task Id : 
> attempt_201201130910_0015_m_000000_0, Status : FAILED
> java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
> Caused by: java.io.IOException: Task process exit with nonzero status of 1.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
> attempt_201201130910_0015_m_000000_0: Cannot locate cassandra.yaml
> attempt_201201130910_0015_m_000000_0: Fatal configuration error; unable to 
> start server.
> }
> Also, please let me know how I can make this cassandra.yaml file available to 
> the Hadoop MapReduce job.

