Hi Huiliang,

Great to hear from you again!

Imagine you have 3 nodes, replication factor = 1, and the default number of tokens (256 per node). With one mapper per token range, you will have 3*256 = 768 mappers. In that case, you will soon run out of mappers or reach the limit.
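To make the 3*256 arithmetic concrete, here is a rough sketch in Python. It assumes exactly one input split (and so one mapper and one connection) per token range. It ignores the extra wrap-around split that gives 257 rather than 256 on a single node, and any further subdivision by split size. The function names and the `mapper_slots` parameter are hypothetical, for illustration only, and are not part of Cassandra or Hadoop.

```python
def estimate_splits(nodes, tokens_per_node=256):
    """Rough estimate of input splits, assuming one split per token range."""
    return nodes * tokens_per_node


def mapper_waves(total_splits, mapper_slots):
    """Number of scheduling rounds needed if only `mapper_slots` mappers
    (a hypothetical cluster limit) can run at the same time."""
    return -(-total_splits // mapper_slots)  # ceiling division


if __name__ == "__main__":
    splits = estimate_splits(nodes=3)        # 3 * 256 token ranges
    print("estimated splits:", splits)
    print("waves with 10 slots:", mapper_waves(splits, mapper_slots=10))
```

The point is only that the split count grows linearly with the number of nodes, so the number of connections opened over the life of the job grows with it, even if they are not all open at once.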
On Tue, Jan 27, 2015 at 10:59 PM, Huiliang Zhang <zhl...@gmail.com> wrote:

> Hi Shenghua, as I understand, each range is assigned to a mapper. Mapper
> will not share connections. So, it needs at least 256 connections to read
> all. But all 256 connections should not be set up at the same time unless
> you have 256 mappers running at the same time.
>
> On Tue, Jan 27, 2015 at 9:34 PM, Shenghua(Daniel) Wan <
> wansheng...@gmail.com> wrote:
>
>> By default, each C* node is set with 256 tokens. On a local 1-node C*
>> server, my hadoop drop creates 256 connections to the server. Is there any
>> way to control this behavior? e.g. reduce the number of connections to a
>> pre-configured gap.
>>
>> I debugged C* source code and found the client asks for partition ranges,
>> or virtual nodes. Then the client was told by server there were 257 ranges,
>> corresponding to 257 column family splits.
>>
>> Here is a snapshot of my logs
>>
>> 15/01/27 18:02:20 DEBUG hadoop.AbstractColumnFamilyInputFormat: adding
>> ColumnFamilySplit((9121856086738887846, '-9223372036854775808] @[localhost])
>> ...
>> totally 257 splits.
>>
>> The problem is the user might only want all the data via a "select *"
>> like statement. It seems that 257 connections to query the rows are
>> necessary. However, is there any way to prohibit 257 concurrent
>> connections?
>>
>> My C* version is 2.0.11 and I also tried CqlPagingInputFormat, which has
>> same behavior.
>>
>> Thank you.
>>
>> --
>>
>> Regards,
>> Shenghua (Daniel) Wan
>>
>
>

--
Regards,
Shenghua (Daniel) Wan