Re: cqlinputformat and retired cqlpagingingputformat creates lots of connections to query the server

Shenghua(Daniel) Wan Tue, 27 Jan 2015 23:05:31 -0800

Hi, Huiliang,
Great to hear from you, again!
Imagine you have 3 nodes, replication factor=1, and the default number of
tokens. You will have 3*256 mappers... In that case, you will soon run out
of mappers or reach the limit.
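
To put rough numbers on that, here is a minimal sketch in plain Python. It
does not use any Cassandra or Hadoop API; the function names and the 16-slot
figure are made up for illustration, and it assumes one split per token range
and one connection per running mapper, as discussed in this thread. The real
split count can be slightly higher (257 rather than 256 in the single-node
log quoted below).

```python
import math


def estimate_splits(nodes: int, num_tokens: int = 256) -> int:
    """Approximate split count, assuming one split per token range (vnode)."""
    return nodes * num_tokens


def concurrent_connections(splits: int, map_slots: int) -> int:
    """Assuming one connection per running mapper, concurrency is bounded
    by whichever is smaller: the number of splits or the mapper slots."""
    return min(splits, map_slots)


def mapper_waves(splits: int, map_slots: int) -> int:
    """Number of sequential rounds needed to work through all splits."""
    return math.ceil(splits / map_slots)


if __name__ == "__main__":
    splits = estimate_splits(nodes=3)          # 3 * 256
    print(splits)                              # 768
    print(concurrent_connections(splits, 16))  # 16 (hypothetical 16 slots)
    print(mapper_waves(splits, 16))            # 48
```

Under these assumptions the total number of connections over the life of the
job tracks the split count, while the number open at any one moment tracks
how many mappers the cluster runs at once.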



On Tue, Jan 27, 2015 at 10:59 PM, Huiliang Zhang <zhl...@gmail.com> wrote:

> Hi Shenghua, as I understand it, each range is assigned to a mapper.
> Mappers do not share connections, so at least 256 connections are needed
> to read everything. But all 256 connections should not be open at the same
> time unless you have 256 mappers running at the same time.
>
> On Tue, Jan 27, 2015 at 9:34 PM, Shenghua(Daniel) Wan <
> wansheng...@gmail.com> wrote:
>
>> By default, each C* node is set with 256 tokens. On a local 1-node C*
>> server, my Hadoop job creates 256 connections to the server. Is there any
>> way to control this behavior, e.g. to limit the number of connections to a
>> pre-configured cap?
>>
>> I debugged the C* source code and found that the client asks for the
>> partition ranges, or virtual nodes. The server then told the client that
>> there were 257 ranges, corresponding to 257 column family splits.
>>
>> Here is a snapshot of my logs
>>
>> 15/01/27 18:02:20 DEBUG hadoop.AbstractColumnFamilyInputFormat: adding
>> ColumnFamilySplit((9121856086738887846, '-9223372036854775808] @[localhost])
>> ...
>> 257 splits in total.
>>
>> The problem is that the user might simply want all the data via a
>> "select *"-like statement. It seems that 257 connections are necessary
>> to query the rows. However, is there any way to avoid 257 concurrent
>> connections?
>>
>> My C* version is 2.0.11 and I also tried CqlPagingInputFormat, which has
>> the same behavior.
>>
>> Thank you.
>>
>> --
>>
>> Regards,
>> Shenghua (Daniel) Wan
>>
>
>


-- 

Regards,
Shenghua (Daniel) Wan
