Hi Cassandra Users-
I have a Hadoop job that uses the pattern in Cassandra 2.0.6's
hadoop_cql3_word_count example to load data from HDFS into Cassandra. Having
read about BulkOutputFormat as a way to potentially significantly increase the
write throughput from Hadoop to Cassandra,
eed to use Hadoop then try the SSTableSimpleWriter and
> sstableloader , this post is a little old but still relevant
> http://www.datastax.com/dev/blog/bulk-loading
>
> Otherwise AFAIK BulkOutputFormat is what you want from hadoop
> http://www.datastax.com/docs/1.1/cluster_archit
>> On Wed, Dec 11, 2013 at 7:58 PM, Aaron Morton
>>> wrote:
>>>
>>>> If you don’t need to use Hadoop then try the SSTableSimpleWriter and
>>>> sstableloader , this post is a little old but still relevant
>>>> http://www.datastax.com/dev/blog/b
pc_timeout.
>>
>>
>> On Wed, Dec 11, 2013 at 7:58 PM, Aaron Morton wrote:
>>
>>> If you don’t need to use Hadoop then try the SSTableSimpleWriter and
>>> sstableloader , this post is a little old but still relevant
>>> http://www.datastax.com/dev/b
.com/dev/blog/bulk-loading
>>
>> Otherwise AFAIK BulkOutputFormat is what you want from hadoop
>> http://www.datastax.com/docs/1.1/cluster_architecture/hadoop_integration
>>
>> Cheers
>>
>> -
>> Aaron Morton
>> New Zealand
>> @aaronm
, 2013 at 7:58 PM, Aaron Morton wrote:
> If you don’t need to use Hadoop then try the SSTableSimpleWriter and
> sstableloader , this post is a little old but still relevant
> http://www.datastax.com/dev/blog/bulk-loading
>
> Otherwise AFAIK BulkOutputFormat is what you want fro
If you don’t need to use Hadoop then try the SSTableSimpleWriter and
sstableloader , this post is a little old but still relevant
http://www.datastax.com/dev/blog/bulk-loading
Otherwise AFAIK BulkOutputFormat is what you want from hadoop
http://www.datastax.com/docs/1.1/cluster_architecture
Hi All,
I want to bulk insert data into Cassandra. I was wondering about using
BulkOutputFormat in Hadoop. Is that the best way, or is using a driver and
doing batch inserts the better way?
Are there any disadvantages to using BulkOutputFormat?
Thanks for helping
Varun
, January 24, 2013 at 6:49 AM, Alexei Bakanov wrote:
>
> Hello,
>
> We see that BulkOutputFormat fails to stream data from multiple reduce
> instances that run on the same host.
> We get the same error messages that issue
> https://issues.apache.org/jira/browse/CASSANDRA-4223 tries
Alexei,
You were right.
It was already fixed to use a UUID for the streaming session, and the fix was
released in 1.2.0.
See https://issues.apache.org/jira/browse/CASSANDRA-4813.
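To illustrate why a counter-based session id can clash when several reducers
run on the same host, here is a minimal, self-contained sketch. It is not
Cassandra's actual streaming code: the class and method names
(SessionIdSketch, CounterIdGenerator, uuidSessionId) and the id layout are
made up for illustration. Each generator instance stands in for one reducer
JVM with its own private counter.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration only; not taken from the Cassandra code base. */
public class SessionIdSketch {

    /** Stands in for one JVM (e.g. one reduce task) with its own private counter. */
    static final class CounterIdGenerator {
        private final String host;
        private final AtomicInteger counter = new AtomicInteger(0);

        CounterIdGenerator(String host) {
            this.host = host;
        }

        /** Builds an id from host + direction flag + per-JVM counter (assumed layout). */
        String next(boolean outbound) {
            return host + ":" + (outbound ? "out" : "in") + ":" + counter.incrementAndGet();
        }
    }

    /** A random UUID does not depend on any per-JVM state. */
    static String uuidSessionId() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        // Two reducers on the same host: same IP, each counter starts at zero.
        CounterIdGenerator reducerA = new CounterIdGenerator("10.0.0.5");
        CounterIdGenerator reducerB = new CounterIdGenerator("10.0.0.5");

        String a = reducerA.next(true);
        String b = reducerB.next(true);
        System.out.println("counter-based ids equal: " + a.equals(b));

        System.out.println("uuid-based ids equal: "
                + uuidSessionId().equals(uuidSessionId()));
    }
}
```

The counter is only unique within one JVM, so two tasks on the same machine
can produce identical ids, whereas a random UUID carries no such shared
starting point.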
On Thursday, January 24, 2013 at 6:49 AM, Alexei Bakanov wrote:
> Hello,
>
> We see that BulkOutputFormat fails to stream
Hello,
We see that BulkOutputFormat fails to stream data from multiple reduce
instances that run on the same host.
We get the same error messages that issue
https://issues.apache.org/jira/browse/CASSANDRA-4223 tries to address.
Looks like (ip-address + in_out_flag + atomic integer) is not unique
Date: Thursday, January 17, 2013 10:39 AM
To: "user@cassandra.apache.org"
Subject: BulkOutputFormat
Hello,
I am facing issu
"user@cassandra.apache.org"
> Date: Thursday, January 17, 2013 10:39 AM
> To: "user@cassandra.apache.org"
> Subject: BulkOutputFormat
>
> Hello,
>
> I am facing issues with BulkOutputFormat loading data from Hadoop to
> Cassandra.
>
> Cluster detai
y, January 17, 2013 10:39 AM
To: "user@cassandra.apache.org"
Subject: BulkOutputFormat
Hello,
I am facing issues with BulkOutputFormat loading data from Hadoop to Cassandra.
Cluster details:
we have 15 nod
The problem was with the compatibility. I was using a lower version of
Cassandra jar files. Now, BulkOutputFormat works fine.
-Original Message-
From: anand_balara...@homedepot.com [mailto:anand_balara...@homedepot.com]
Sent: Friday, December 14, 2012 12:37 AM
To: user
: user@cassandra.apache.org
Subject: Re: BulkOutputFormat error -
org.apache.thrift.transport.TTransportException
Looks like it cannot connect to the server
>conf.set("cassandra.output.thrift.address", "localhost");
Is this the same address as the rpc_address in the cas
land
@aaronmorton
http://www.thelastpickle.com
On 14/12/2012, at 9:57 AM, anand_balara...@homedepot.com wrote:
> Hi
>
> I am a newbie to Cassandra. Was trying out a sample (word count) code on
> BulkOutputFormat and got stuck with an error.
>
> What I am trying to do is – migrate
Hi
I am a newbie to Cassandra. Was trying out a sample (word count) code on
BulkOutputFormat and got stuck with an error.
What I am trying to do is - migrate all Hive tables (from Hadoop cluster) to
Cassandra column families.
My MR program is configured to run on Hadoop cluster v 0.20.2
Hello,
Is BulkOutputFormat intended to be compatible with MRv1 (mapred) at
all? I'm trying to write to Cassandra, roughly following the example
at
http://shareitexploreit.blogspot.se/2012/03/bulkloadto-cassandra-with-hadoop.html
but with MRv1 - that is, calling output.collect(r
Date: Wednesday, October 17, 2012 12:25 PM
To: "user@cassandra.apache.org"
Subject: EOFException with BulkOutputFormat in 1.1.6
I'm getting EOFExceptio
I'm getting EOFExceptions with BulkOutputFormat
2012-10-17 12:23:01,182 ERROR
org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor: Error in
ThreadPoolExecutor
java.lang.RuntimeException: java.io.EOFException
at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:62
cluster. Does that make sense?
Cheers,
Ralph
From: matgan...@hotmail.com
To: user@cassandra.apache.org
Subject: RE: Problem while streaming SSTables with BulkOutputFormat
Date: Tue, 9 Oct 2012 22:29:41 +
Aaron, thank you for your answer. I tried to move to Cassandra 1.1.5, but the
error still
of this issue?
Cheers
Ralph
> Subject: Re: Problem while streaming SSTables with BulkOutputFormat
> From: aa...@thelastpickle.com
> Date: Wed, 10 Oct 2012 10:05:13 +1300
> To: user@cassandra.apache.org
>
> Something, somewhere, at some point is breaking the connection. Sorry I
>
both a task tracker and Cassandra?
Cheers
On 9/10/2012, at 4:06 AM, Ralph Romanos wrote:
> Hello,
>
> I am using BulkOutputFormat to load data from a .csv file into Cassandra. I
> am using Cassandra 1.1.3 and Hadoop 0.20.2.
> I have 7 hadoop nodes: 1 namenode/jobtracker
Hello,
I am using BulkOutputFormat to load data from a .csv file into Cassandra. I am
using Cassandra 1.1.3 and Hadoop 0.20.2. I have 7 Hadoop nodes: 1
namenode/jobtracker and 6 datanodes/tasktrackers. Cassandra is installed on 4
of these 6 datanodes/tasktrackers. The issue happens when I have
Jeltema
> wrote:
>
>> I'm trying to do a bulk load from a Cassandra/Hadoop job using the
>> BulkOutputFormat class.
>> It appears that the reducers are generating the SSTables, but are failing to
>> load them into the cluster:
>>
>> 12/09/14 14:08:13
eason?
If the temp dir hasn't been cleaned up yet, you are able to retry, fwiw.
Jeremy
On Sep 14, 2012, at 1:34 PM, Brian Jeltema
wrote:
> I'm trying to do a bulk load from a Cassandra/Hadoop job using the
> BulkOutputFormat class.
> It appears that the reducers are generat
I'm trying to do a bulk load from a Cassandra/Hadoop job using the
BulkOutputFormat class.
It appears that the reducers are generating the SSTables, but are failing to
load them into the cluster:
12/09/14 14:08:13 INFO mapred.JobClient: Task Id :
attempt_201208201337_0184_r_04_0, S
We're working on this over at
https://issues.apache.org/jira/browse/CASSANDRA-4208
On Fri, May 4, 2012 at 4:56 PM, Shawna Qian wrote:
> Hi Group:
>
> I am following this great example to use BulkOutputFormat to stream the
> data from Hadoop to Cassandra.
> http://shareitexploreit.blogspot.com/2
Hi Group:
I am following this great example to use BulkOutputFormat to stream the
data from Hadoop to Cassandra:
http://shareitexploreit.blogspot.com/2012/03/bulkloadto-cassandra-with-hadoop.html
It works perfectly when my keyspace has one cf.
But in my case, I have 2 column families defined
On Wed, May 2, 2012 at 2:23 PM, Shawna Qian wrote:
> Hello:
>
> I am trying to use BulkOutputFormat and have seen some nice docs on how to use
> it to stream the data to an existing Cassandra cluster using the ConfigHelper
> class. I am wondering if it is possible to use it just to
Hello:
I am trying to use BulkOutputFormat and have seen some nice docs on how to use
it to stream the data to an existing Cassandra cluster using the ConfigHelper
class. I am wondering if it is possible to use it just to stream the data
(SSTables etc.) into HDFS?
Thx
Shawna
Friday, February 17, 2012 at 6:18 AM, Erik Forsberg wrote:
> Hi!
>
> If I run a hadoop job that uses BulkOutputFormat to write data to
> Cassandra, and that hadoop job is aborted, i.e. streaming sessions are
> not completed, it seems like the streaming sessions hang around for a
>
Hi!
If I run a hadoop job that uses BulkOutputFormat to write data to
Cassandra, and that hadoop job is aborted, i.e. streaming sessions are
not completed, it seems like the streaming sessions hang around for a
very long time, I've observed at least 12-15h, in output from 'nodetool
On Mon, Jan 9, 2012 at 1:18 AM, Erik Forsberg wrote:
> Hi!
>
> Can the new BulkOutputFormat
> (https://issues.apache.org/jira/browse/CASSANDRA-3045) be used to load data
> to servers running cassandra 0.8.7 and/or Cassandra 1.0.6?
>
> I'm thinking of using jar files fr
Hi!
Can the new BulkOutputFormat
(https://issues.apache.org/jira/browse/CASSANDRA-3045) be used to load
data to servers running cassandra 0.8.7 and/or Cassandra 1.0.6?
I'm thinking of using jar files from the development version to load
data onto a production cluster which I want to ke