sorting with sstableloader

2014-01-24 Thread varun allampalli
Hi,

I have a table with the following schema:

CREATE TABLE TEST_TABLE (
  keyCol bigint,
  col1 bigint,
  col2 bigint,
  col3 text,
  -- implied by the clustering order below
  PRIMARY KEY (keyCol, col1, col2)
) WITH CLUSTERING ORDER BY (col1 DESC, col2 DESC)

on Cassandra 1.2.13.


I used SSTableSimpleUnsortedWriter and sstableloader to load some data for
the partition key keyCol1.

select * from test_table returned

 keyCol  | col1 | col2 | col3
---------+------+------+-------
 keyCol1 |   10 |    1 | abcde
 keyCol1 |    9 |    1 | adfsa


I then created more sstables and loaded them with sstableloader. They also
contain a record for keyCol1, but with a col1 value of 20:

 keyCol  | col1 | col2 | col3
---------+------+------+-------
 keyCol1 |   20 |    1 | afd

So when I queried select * from test_table I was expecting

 keyCol1 |   20 |    1 | afd
 keyCol1 |   10 |    1 | abcde
 keyCol1 |    9 |    1 | adfsa

but it returned

 keyCol1 |   10 |    1 | abcde
 keyCol1 |    9 |    1 | adfsa
 keyCol1 |   20 |    1 | afd

How can I fix this sorting? The row with col1 = 20 was inserted later using
sstableloader, so it shows up as the last row.

Is there any way to rewrite the sstables to fix this sorting, or do I need to
run something after sstableloader to fix it?

Thanks
Varun
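
A minimal sketch of the expected read behaviour, assuming that rows for one
partition are merged across SSTables in clustering order rather than returned
in load order. The lists, the merged_read helper, and the tuple layout are
illustrative stand-ins written in Python, not Cassandra APIs. The merge only
yields a sorted result if every input run was itself written in
(col1 DESC, col2 DESC) order, so a writer set up with a different comparator
than the table is one possible cause worth ruling out.

```python
import heapq

# Each list stands in for the rows of partition keyCol1 held in one SSTable,
# as (col1, col2, col3) tuples already sorted by (col1 DESC, col2 DESC).
# These names are hypothetical; they are not part of any Cassandra API.
sstable_first_load = [(10, 1, "abcde"), (9, 1, "adfsa")]
sstable_second_load = [(20, 1, "afd")]


def merged_read(*sstables):
    """Merge per-SSTable sorted runs into one result in DESC clustering order.

    heapq.merge with reverse=True expects every input to be sorted from
    largest to smallest on the same key; it does not re-sort the inputs.
    """
    return list(
        heapq.merge(*sstables, key=lambda row: (row[0], row[1]), reverse=True)
    )


rows = merged_read(sstable_first_load, sstable_second_load)
for row in rows:
    print(row)
```

With both runs sorted consistently, the row with col1 = 20 comes first
regardless of which load it arrived in.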


SSTableloader

2013-12-26 Thread varun allampalli
Hi,

I am trying to load about a million records using sstableloader with
Cassandra 1.2. It streams very fast, but towards the end the streaming gets
stuck on two or three machines in the cluster, while all the rest are 100%
done.

Has anybody seen this problem, and is there any tool I can use to diagnose
the loading?

Thanks in advance.

Varun


Re: Bulkoutputformat

2013-12-13 Thread varun allampalli
Thanks Rahul, the article was insightful.


On Fri, Dec 13, 2013 at 12:25 AM, Rahul Menon wrote:

> Here you go
>
> http://thelastpickle.com/blog/2013/01/11/primary-keys-in-cql.html
>
>
> On Fri, Dec 13, 2013 at 7:19 AM, varun allampalli <
> vshoori.off...@gmail.com> wrote:
>
>> Hi Aaron,
>>
>> It seems like you answered the question here.
>>
>> https://groups.google.com/forum/#!topic/nosql-databases/vjZA5vdycWA
>>
>> Can you give me the link to the blog post you mentioned?
>>
>> http://thelastpickle.com/2013/01/11/primary-keys-in-cql/
>>
>> Thanks in advance
>> Varun
>>
>>
>> On Thu, Dec 12, 2013 at 3:36 PM, varun allampalli <
>> vshoori.off...@gmail.com> wrote:
>>
>>> Thanks Aaron, I was able to generate sstables and load them using
>>> sstableloader. But after loading, when I run a select query I get the
>>> error below, even though the table has only one record. Is there anything
>>> I am missing, or any logs I can look at?
>>>
>>> Request did not complete within rpc_timeout.
>>>
>>>
>>> On Wed, Dec 11, 2013 at 7:58 PM, Aaron Morton wrote:
>>>
>>>> If you don’t need to use Hadoop then try the SSTableSimpleWriter and
>>>> sstableloader. This post is a little old but still relevant:
>>>> http://www.datastax.com/dev/blog/bulk-loading
>>>>
>>>> Otherwise, AFAIK, BulkOutputFormat is what you want from Hadoop:
>>>> http://www.datastax.com/docs/1.1/cluster_architecture/hadoop_integration
>>>>
>>>> Cheers
>>>>
>>>>  -
>>>> Aaron Morton
>>>> New Zealand
>>>> @aaronmorton
>>>>
>>>> Co-Founder & Principal Consultant
>>>> Apache Cassandra Consulting
>>>> http://www.thelastpickle.com
>>>>
>>>> On 12/12/2013, at 11:27 am, varun allampalli wrote:
>>>>
>>>> Hi All,
>>>>
>>>> I want to bulk insert data into Cassandra. I was thinking of using
>>>> BulkOutputFormat in Hadoop. Is that the best way, or is using a driver
>>>> and doing batch inserts the better way?
>>>>
>>>> Are there any disadvantages to using BulkOutputFormat?
>>>>
>>>> Thanks for helping
>>>>
>>>> Varun
>>>>
>>>>
>>>>
>>>
>>
>


Re: Bulkoutputformat

2013-12-12 Thread varun allampalli
Hi Aaron,

It seems like you answered the question here.

https://groups.google.com/forum/#!topic/nosql-databases/vjZA5vdycWA

Can you give me the link to the blog post you mentioned?

http://thelastpickle.com/2013/01/11/primary-keys-in-cql/

Thanks in advance
Varun




Re: Bulkoutputformat

2013-12-12 Thread varun allampalli
Thanks Aaron, I was able to generate sstables and load them using
sstableloader. But after loading, when I run a select query I get the error
below, even though the table has only one record. Is there anything I am
missing, or any logs I can look at?

Request did not complete within rpc_timeout.




Bulkoutputformat

2013-12-11 Thread varun allampalli
Hi All,

I want to bulk insert data into Cassandra. I was thinking of using
BulkOutputFormat in Hadoop. Is that the best way, or is using a driver and
doing batch inserts the better way?

Are there any disadvantages to using BulkOutputFormat?

Thanks for helping

Varun