sorting with sstableloader

2014-01-24 Thread varun allampalli
Hi,

I have a table with the following schema on Cassandra 1.2.13:

CREATE TABLE TEST_TABLE (
  keyCol bigint,
  col1 bigint,
  col2 bigint,
  col3 text,
  PRIMARY KEY (keyCol, col1, col2)
) WITH CLUSTERING ORDER BY (col1 DESC, col2 DESC);


I used SSTableSimpleUnsortedWriter and sstableloader to load some data for
the key keyCol1. After that,

select * from test_table

returned:

keyCol   col1  col2  col3
keyCol1  10    1     abcde
keyCol1  9     1     adfsa


I then created more SSTables and loaded them with sstableloader. They
contain a record for keyCol1 again, but with a col1 value of 20:

keyCol   col1  col2  col3
keyCol1  20    1     afd

So when I ran select * from test_table again, I was expecting

keyCol1  20    1     afd
keyCol1  10    1     abcde
keyCol1  9     1     adfsa

but it returned

keyCol1  10    1     abcde
keyCol1  9     1     adfsa
keyCol1  20    1     afd

How can I fix this sorting? The row with col1 = 20 was inserted later using
sstableloader, so it shows up as the last row.

Is there any way to rewrite the SSTables to fix the order, or do I need to
run something after sstableloader to fix it?
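
For reference, the order the schema asks for within one partition can be
reproduced with a small sketch. This is plain Python, not Cassandra code: the
row tuples and the clustering_order helper are illustrative assumptions that
only model CLUSTERING ORDER BY (col1 DESC, col2 DESC) on the three rows above.
It does not explain why the loaded SSTables came back in a different order.

```python
# Rows as (keyCol, col1, col2, col3), listed in the order the query returned them.
rows = [
    ("keyCol1", 10, 1, "abcde"),
    ("keyCol1", 9, 1, "adfsa"),
    ("keyCol1", 20, 1, "afd"),
]


def clustering_order(row):
    """Sort key modelling CLUSTERING ORDER BY (col1 DESC, col2 DESC).

    Hypothetical helper: negating both numeric columns turns Python's
    ascending sort into a descending one on col1, then col2.
    """
    _, col1, col2, _ = row
    return (-col1, -col2)


# All rows share one partition key, so only the clustering columns matter here.
expected = sorted(rows, key=clustering_order)

for row in expected:
    print(row)
```

The sorted list puts the col1 = 20 row first, which is the order expected in
the post.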

Thanks
Varun


SSTableloader

2013-12-26 Thread varun allampalli
Hi,

I am trying to load about a million records using sstableloader with
Cassandra 1.2. It streams very fast at first, but near the end the streaming
gets stuck on two or three machines in the cluster, while all the rest are
100% done.

Has anybody seen this problem, and is there a tool I can use to diagnose
the load?

Thanks in advance.

Varun


Re: Bulkoutputformat

2013-12-13 Thread varun allampalli
Thanks Rahul, the article was insightful.


On Fri, Dec 13, 2013 at 12:25 AM, Rahul Menon ra...@apigee.com wrote:

 Here you go

 http://thelastpickle.com/blog/2013/01/11/primary-keys-in-cql.html


 On Fri, Dec 13, 2013 at 7:19 AM, varun allampalli 
 vshoori.off...@gmail.com wrote:

 Hi Aaron,

 It seems like you answered the question here.

 https://groups.google.com/forum/#!topic/nosql-databases/vjZA5vdycWA

 Can you give me the link to the blog post you mentioned?

 http://thelastpickle.com/2013/01/11/primary-keys-in-cql/

 Thanks in advance
 Varun


 On Thu, Dec 12, 2013 at 3:36 PM, varun allampalli 
 vshoori.off...@gmail.com wrote:

 Thanks Aaron, I was able to generate SSTables and load them using
 sstableloader. But after loading, when I run a select query I get the error
 below; the table has only one record. Is there anything I am missing, or
 are there any logs I can look at?

 Request did not complete within rpc_timeout.


 On Wed, Dec 11, 2013 at 7:58 PM, Aaron Morton
 aa...@thelastpickle.com wrote:

 If you don’t need to use Hadoop then try the SSTableSimpleWriter and
 sstableloader; this post is a little old but still relevant:
 http://www.datastax.com/dev/blog/bulk-loading

 Otherwise, AFAIK, BulkOutputFormat is what you want from Hadoop:
 http://www.datastax.com/docs/1.1/cluster_architecture/hadoop_integration

 Cheers

  -
 Aaron Morton
 New Zealand
 @aaronmorton

 Co-Founder & Principal Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com

 On 12/12/2013, at 11:27 am, varun allampalli vshoori.off...@gmail.com
 wrote:

 Hi All,

 I want to bulk insert data into Cassandra. I was thinking of using
 BulkOutputFormat in Hadoop. Is that the best way, or is using the driver
 and doing batch inserts the better way?

 Are there any disadvantages to using BulkOutputFormat?

 Thanks for helping

 Varun


Re: Bulkoutputformat

2013-12-12 Thread varun allampalli
Thanks Aaron, I was able to generate SSTables and load them using
sstableloader. But after loading, when I run a select query I get the error
below; the table has only one record. Is there anything I am missing, or are
there any logs I can look at?

Request did not complete within rpc_timeout.


Re: Bulkoutputformat

2013-12-12 Thread varun allampalli
Hi Aaron,

It seems like you answered the question here.

https://groups.google.com/forum/#!topic/nosql-databases/vjZA5vdycWA

Can you give me the link to the blog post you mentioned?

http://thelastpickle.com/2013/01/11/primary-keys-in-cql/

Thanks in advance
Varun


Bulkoutputformat

2013-12-11 Thread varun allampalli
Hi All,

I want to bulk insert data into Cassandra. I was thinking of using
BulkOutputFormat in Hadoop. Is that the best way, or is using the driver and
doing batch inserts the better way?

Are there any disadvantages to using BulkOutputFormat?

Thanks for helping

Varun