Re: Index population over a table containing 2.3 x 10^10 records

2018-03-22 Thread Margusja
Great hint! Looks like it helped! 

The power of the community is a great thing!

Br, Margus

> On 22 Mar 2018, at 18:24, Josh Elser  wrote:
> 
> Hard to say at a glance, but this issue is happening down in the MapReduce 
> framework, not in Phoenix itself.
> 
> It looks similar to problems I've seen many years ago around 
> mapreduce.task.io.sort.mb. You can try reducing that value. It also may be 
> related to a bug in your Hadoop version.
> 
> Good luck!
> 
> On 3/22/18 4:37 AM, Margusja wrote:
>> Hi
>> Needed to recreate indexes over a main table containing more than 2.3 x 10^10 records.
>> I used ASYNC and org.apache.phoenix.mapreduce.index.IndexTool
>> One index succeeded, but another gives this stack trace:
>> 2018-03-20 13:23:16,723 FATAL [IPC Server handler 0 on 43926] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1521544097253_0004_m_08_0 - exited :
>> java.lang.ArrayIndexOutOfBoundsException
>>  at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1453)
>>  at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1349)
>>  at java.io.DataOutputStream.writeInt(DataOutputStream.java:197)
>>  at org.apache.hadoop.hbase.io.ImmutableBytesWritable.write(ImmutableBytesWritable.java:159)
>>  at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:98)
>>  at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:82)
>>  at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1149)
>>  at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:715)
>>  at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>>  at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>>  at org.apache.phoenix.mapreduce.index.PhoenixIndexImportMapper.map(PhoenixIndexImportMapper.java:114)
>>  at org.apache.phoenix.mapreduce.index.PhoenixIndexImportMapper.map(PhoenixIndexImportMapper.java:48)
>>  at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
>>  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>  at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
>>  at java.security.AccessController.doPrivileged(Native Method)
>>  at javax.security.auth.Subject.doAs(Subject.java:422)
>>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
>>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
>> Is there any best practice for dealing with situations like this?
>> Br, Margus



Re: phoenix table with 50 salt buckets ( regions ) - now shows as 68 regions and 18 of them stale

2018-03-22 Thread Adi Kadimetla
I did not set any split policy. I was under the assumption that
'hbase.hregion.max.filesize' => '107374182400' ( 100 GB ) would take care of
it, and the region size was also within 33 GB.

I want to understand this: even if a split happens it is of no use, since the
first salt byte (1 - 49) is what places keys into a region. Do we need
ConstantSizeRegionSplitPolicy for pre-split regions in this case when using salt?


On Thu, Mar 22, 2018 at 5:21 PM, Jonathan Leech  wrote:

> Did you set the split policy to ConstantSizeRegionSplitPolicy?
>
> > On Mar 22, 2018, at 2:56 PM, Adi Kadimetla  wrote:
> >
> > Group,
> > TABLE - with 50 salt buckets, configured as a time-series table.
> >
> > Having pre-split into 50 SALT buckets, we disabled region splits by setting
> > the max file size for a split to 100 GB.
> >
> > I see that some of the keys got split and created stale regions.
> >
> > No writes are happening to region f3f0a711370c8acb88f5294ec9dfa648.
> >
> > region - f3f0a711370c8acb88f5294ec9dfa648
> > start key - .\x00\x00 ... ( all \x00 )
> > end key - .\x80\x00\x01`\x02<8\x07\x00\x00\x00\x00\x81\x80\x00\x01`\x02<\x14`\x00\x00\x00\x005VPK06CewQV2nI_1qS6baA\x00oAPASXODoBkIBIYJT3BPrQ
> >
> >
> > All the writes are going to region 7feabc1eea26727c03366efba27f2016.
> >
> > region - 7feabc1eea26727c03366efba27f2016
> > start key - .\x80\x00\x01`\x02<8\x07\x00\x00\x00\x00\x81\x80\x00\x01`\x02<\x14`\x00\x00\x00\x005VPK06CewQV2nI_1qS6baA\x00oAPASXODoBkIBIYJT3BPrQ
> >
> > end key - /\x00\x00 ... ( all \x00 )
> >
> >
> > salt bucket byte - hash(row key) % N salt buckets
> >
> >
> > I was expecting only salt bucket bytes for the start and end keys. Instead
> > I see that row keys have been appended to the salt bucket byte and those
> > regions are not used.
> >
> > Any pointers on what caused the split and how to prevent it?
> >
> > Is there a command to remove the stale region and restore contiguous start
> > and end keys?
> >
> > region - 7feabc1eea26727c03366efba27f2016
> >
> > start key - .\x00\x00 ... ( all \x00 )
> >
> > end key - /\x00\x00 ... ( all \x00 )
> >
> >
> > HDP - 2.6.3
> > HBase 1.1.2
> > Phoenix 4.7.0
> >
> >
> > Thanks


Re: phoenix table with 50 salt buckets ( regions ) - now shows as 68 regions and 18 of them stale

2018-03-22 Thread Jonathan Leech
Did you set the split policy to ConstantSizeRegionSplitPolicy?
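
If not, here is a minimal sketch, assuming the HBase 1.x client API and a hypothetical table name, of pinning that policy on an existing table. (The default IncreasingToUpperBoundRegionSplitPolicy in HBase 1.1 can split well below hbase.hregion.max.filesize while a region server hosts only a few regions of the table, which would explain splits despite the 100 GB limit.)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class PinSplitPolicy {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            // "MY_SALTED_TABLE" is a placeholder for the Phoenix table's HBase name.
            TableName table = TableName.valueOf("MY_SALTED_TABLE");
            HTableDescriptor desc = admin.getTableDescriptor(table);
            // ConstantSizeRegionSplitPolicy splits only at hbase.hregion.max.filesize;
            // DisabledRegionSplitPolicy would block automatic splits entirely.
            desc.setRegionSplitPolicyClassName(
                    "org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy");
            // Depending on cluster settings, the table may need to be disabled first.
            admin.modifyTable(table, desc);
        }
    }
}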

> On Mar 22, 2018, at 2:56 PM, Adi Kadimetla  wrote:
> 
> Group,
> TABLE - with 50 salt buckets, configured as a time-series table.
> 
> Having pre-split into 50 SALT buckets, we disabled region splits by setting
> the max file size for a split to 100 GB.
> 
> I see that some of the keys got split and created stale regions.
> 
> No writes are happening to region f3f0a711370c8acb88f5294ec9dfa648.
> 
> region - f3f0a711370c8acb88f5294ec9dfa648
> start key - .\x00\x00 ... ( all \x00 )
> end key - .\x80\x00\x01`\x02<8\x07\x00\x00\x00\x00\x81\x80\x00\x01`\x02<\x14`\x00\x00\x00\x005VPK06CewQV2nI_1qS6baA\x00oAPASXODoBkIBIYJT3BPrQ
> 
> 
> All the writes are going to region 7feabc1eea26727c03366efba27f2016.
> 
> region - 7feabc1eea26727c03366efba27f2016
> start key - .\x80\x00\x01`\x02<8\x07\x00\x00\x00\x00\x81\x80\x00\x01`\x02<\x14`\x00\x00\x00\x005VPK06CewQV2nI_1qS6baA\x00oAPASXODoBkIBIYJT3BPrQ
> 
> end key - /\x00\x00 ... ( all \x00 )
> 
> 
> salt bucket byte - hash(row key) % N salt buckets
> 
> 
> I was expecting only salt bucket bytes for the start and end keys. Instead I
> see that row keys have been appended to the salt bucket byte and those
> regions are not used.
> 
> Any pointers on what caused the split and how to prevent it?
> 
> Is there a command to remove the stale region and restore contiguous start
> and end keys?
> 
> region - 7feabc1eea26727c03366efba27f2016
> 
> start key - .\x00\x00 ... ( all \x00 )
> 
> end key - /\x00\x00 ... ( all \x00 )
> 
> 
> HDP - 2.6.3
> HBase 1.1.2
> Phoenix 4.7.0
> 
> 
> Thanks


phoenix table with 50 salt buckets ( regions ) - now shows as 68 regions and 18 of them stale

2018-03-22 Thread Adi Kadimetla
Group,
TABLE - with 50 salt buckets, configured as a time-series table.

Having pre-split into 50 SALT buckets, we disabled region splits by setting
the max file size for a split to 100 GB.

I see that some of the keys got split and created stale regions.

No writes are happening to region f3f0a711370c8acb88f5294ec9dfa648.

region - f3f0a711370c8acb88f5294ec9dfa648
start key - .\x00\x00 ... ( all \x00 )
end key - .\x80\x00\x01`\x02<8\x07\x00\x00\x00\x00\x81\x80\x00\x01`\x02<\x14`\x00\x00\x00\x005VPK06CewQV2nI_1qS6baA\x00oAPASXODoBkIBIYJT3BPrQ


All the writes are going to region 7feabc1eea26727c03366efba27f2016.

region - 7feabc1eea26727c03366efba27f2016
start key - .\x80\x00\x01`\x02<8\x07\x00\x00\x00\x00\x81\x80\x00\x01`\x02<\x14`\x00\x00\x00\x005VPK06CewQV2nI_1qS6baA\x00oAPASXODoBkIBIYJT3BPrQ

end key - /\x00\x00 ... ( all \x00 )


salt bucket byte - hash(row key) % N salt buckets


I was expecting only salt bucket bytes for the start and end keys. Instead
I see that row keys have been appended to the salt bucket byte and those
regions are not used.
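
For context, an illustrative sketch of the salt-byte arithmetic above; it only approximates the idea and is not Phoenix's exact SaltingUtil code:

import java.nio.charset.StandardCharsets;

public final class SaltByteSketch {
    // Hash the full row key and reduce it modulo the bucket count, so the
    // prepended salt byte always falls in [0, saltBuckets).
    static byte saltByte(byte[] rowKey, int saltBuckets) {
        int hash = 0;
        for (byte b : rowKey) {
            hash = 31 * hash + b;   // simple rolling hash over the key bytes
        }
        return (byte) Math.abs(hash % saltBuckets);
    }

    public static void main(String[] args) {
        byte[] rowKey = "some-time-series-row-key".getBytes(StandardCharsets.UTF_8);
        // With SALT_BUCKETS = 50 this yields one of 50 values, which is why the
        // intended pre-split region boundaries are single leading salt bytes.
        System.out.println(saltByte(rowKey, 50));
    }
}

Once HBase itself splits a region, though, the new boundary keeps whatever full row key it split at, which is consistent with the long start/end keys shown above.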

Any pointers on what caused the split and how to prevent it?

Is there a command to remove the stale region and restore contiguous start
and end keys?

region - 7feabc1eea26727c03366efba27f2016

start key - .\x00\x00 ... ( all \x00 )

end key - /\x00\x00 ... ( all \x00 )


HDP - 2.6.3
HBase 1.1.2
Phoenix 4.7.0


Thanks


Re: Phoenix Exception - Connection is null or closed.

2018-03-22 Thread Josh Elser

Hey Anil,

Are you sure there isn't another exception earlier in the output of your 
application? The exception you have here looks more like the JVM was 
already shutting down and Phoenix had closed the connection (the 
exceptions were about queued tasks being cleared out after the decision 
to shut down had already been made).
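
Not a fix from this thread, but as a sanity check, a minimal sketch of keeping the Phoenix JDBC connection (and its client thread pools) alive for the full lifetime of the work; the JDBC URL and parameter value are placeholders, and the query is the one from your log:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class ConnectionLifecycleSketch {
    public static void main(String[] args) throws Exception {
        // try-with-resources closes the connection only after all work below is done,
        // so no query can land on a client that is already shutting down.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181");
             PreparedStatement ps = conn.prepareStatement(
                     "select uuids from account where siteid = ?")) {
            ps.setString(1, "0101293035");   // placeholder siteId taken from the log
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getObject(1));
                }
            }
        }
        // Only after this block should the JVM (or any shutdown hook) be allowed to exit.
    }
}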


On 3/22/18 8:14 AM, Anil wrote:

Hi Team,

We have upgraded Phoenix from 4.7.0 to 4.11.0 and started noticing 
the attached exception.


Can you help me identify the root cause of the exception? Thanks.

Regards,
Anil


Re: Index population over a table containing 2.3 x 10^10 records

2018-03-22 Thread Josh Elser
Hard to say at a glance, but this issue is happening down in the 
MapReduce framework, not in Phoenix itself.


It looks similar to problems I've seen many years ago around 
mapreduce.task.io.sort.mb. You can try reducing that value. It also may 
be related to a bug in your Hadoop version.


Good luck!
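
If it helps, a minimal sketch, assuming IndexTool picks up the configuration handed to ToolRunner, of launching the rebuild with a smaller map-side sort buffer; the table, index, and output path names are placeholders, and 256 MB is simply a value to try. With ToolRunner you should equally be able to pass -Dmapreduce.task.io.sort.mb=256 on the command line.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.util.ToolRunner;
import org.apache.phoenix.mapreduce.index.IndexTool;

public class RebuildIndexWithSmallerSortBuffer {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Shrink the map-side sort buffer before the MapReduce job is submitted.
        conf.setInt("mapreduce.task.io.sort.mb", 256);
        int rc = ToolRunner.run(conf, new IndexTool(), new String[] {
                "--data-table", "MY_TABLE",          // placeholder
                "--index-table", "MY_INDEX",         // placeholder
                "--output-path", "/tmp/MY_INDEX_HFILES"
        });
        System.exit(rc);
    }
}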

On 3/22/18 4:37 AM, Margusja wrote:

Hi

Needed to recreate indexes over a main table containing more than 2.3 x 
10^10 records.

I used ASYNC and org.apache.phoenix.mapreduce.index.IndexTool


One index succeeded, but another gives this stack trace:

2018-03-20 13:23:16,723 FATAL [IPC Server handler 0 on 43926] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1521544097253_0004_m_08_0 - exited :
java.lang.ArrayIndexOutOfBoundsException
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1453)
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1349)
 at java.io.DataOutputStream.writeInt(DataOutputStream.java:197)
 at org.apache.hadoop.hbase.io.ImmutableBytesWritable.write(ImmutableBytesWritable.java:159)
 at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:98)
 at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:82)
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1149)
 at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:715)
 at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
 at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
 at org.apache.phoenix.mapreduce.index.PhoenixIndexImportMapper.map(PhoenixIndexImportMapper.java:114)
 at org.apache.phoenix.mapreduce.index.PhoenixIndexImportMapper.map(PhoenixIndexImportMapper.java:48)
 at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)



Is there any best practice for dealing with situations like this?

Br, Margus


Phoenix Exception - Connection is null or closed.

2018-03-22 Thread Anil
Hi Team,

We have upgraded Phoenix from 4.7.0 to 4.11.0 and started noticing the
attached exception.

Can you help me identify the root cause of the exception? Thanks.

Regards,
Anil
2018-03-21 08:13:19,684 ERROR com.tst.hadoop.flume.writer.inventory.AccountPersistenceImpl: Error querying account UUIDs for siteId 0101293035
org.springframework.dao.DataAccessResourceFailureException: PreparedStatementCallback; SQL [select uuids from account where siteid = ?]; java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@5ddc0ec5 rejected from java.util.concurrent.ThreadPoolExecutor@5204afb0[Shutting down, pool size = 59, active threads = 0, queued tasks = 0, completed tasks = 2667625]; nested exception is org.apache.phoenix.exception.PhoenixIOException: java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@5ddc0ec5 rejected from java.util.concurrent.ThreadPoolExecutor@5204afb0[Shutting down, pool size = 59, active threads = 0, queued tasks = 0, completed tasks = 2667625]
at org.springframework.jdbc.support.SQLStateSQLExceptionTranslator.doTranslate(SQLStateSQLExceptionTranslator.java:105)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81)
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:660)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:695)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:722)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:772)
at org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate.query(NamedParameterJdbcTemplate.java:192)
at org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate.queryForList(NamedParameterJdbcTemplate.java:290)
at com.tst.hadoop.flume.writer.inventory.AccountPersistenceImpl.getUUIDs(AccountPersistenceImpl.java:187)
at net.juniper.spark.stream.sap.data.mapper.ServiceContractMapper.populatePartnerAccountId(ServiceContractMapper.java:134)
at net.juniper.spark.stream.sap.data.mapper.ServiceContractMapper.map(ServiceContractMapper.java:70)
at net.juniper.spark.stream.sap.data.mapper.ServiceContractMapper.map(ServiceContractMapper.java:39)
at net.juniper.spark.stream.sap.processor.SAPDataProcessor.mapObject(SAPDataProcessor.java:34)
at net.juniper.spark.stream.sap.processor.SAPDataProcessor.processData(SAPDataProcessor.java:43)
at com.tst.hadoop.flume.sink.SAPContractCustomSink.process(SAPContractCustomSink.java:113)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.phoenix.exception.PhoenixIOException: java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@5ddc0ec5 rejected from java.util.concurrent.ThreadPoolExecutor@5204afb0[Shutting down, pool size = 59, active threads = 0, queued tasks = 0, completed tasks = 2667625]
at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:116)
at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:875)
at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:819)
at org.apache.phoenix.iterate.RoundRobinResultIterator.getIterators(RoundRobinResultIterator.java:176)
at org.apache.phoenix.iterate.RoundRobinResultIterator.next(RoundRobinResultIterator.java:91)
at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:778)
at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
at org.springframework.jdbc.core.RowMapperResultSetExtractor.extractData(RowMapperResultSetExtractor.java:92)
at org.springframework.jdbc.core.RowMapperResultSetExtractor.extractData(RowMapperResultSetExtractor.java:60)
at org.springframework.jdbc.core.JdbcTemplate$1.doInPreparedStatement(JdbcTemplate.java:708)
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:644)
... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 

Index population over a table containing 2.3 x 10^10 records

2018-03-22 Thread Margusja
Hi 

Needed to recreate indexes over a main table containing more than 2.3 x 10^10 
records.
I used ASYNC and org.apache.phoenix.mapreduce.index.IndexTool
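
For reference, a minimal sketch (hypothetical table, column, and index names) of how such an ASYNC index can be declared through the Phoenix JDBC driver; ASYNC only records the index metadata, and the rows are populated later by IndexTool:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CreateAsyncIndex {
    public static void main(String[] args) throws Exception {
        // The ZooKeeper quorum in the URL is a placeholder.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181");
             Statement stmt = conn.createStatement()) {
            // ASYNC defers population; the index stays in BUILDING state until
            // org.apache.phoenix.mapreduce.index.IndexTool is run against it.
            stmt.execute("CREATE INDEX MY_INDEX ON MY_TABLE (SOME_COL) ASYNC");
        }
    }
}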


One index succeeded, but another gives this stack trace:

2018-03-20 13:23:16,723 FATAL [IPC Server handler 0 on 43926] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1521544097253_0004_m_08_0 - exited :
java.lang.ArrayIndexOutOfBoundsException
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1453)
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1349)
 at java.io.DataOutputStream.writeInt(DataOutputStream.java:197)
 at org.apache.hadoop.hbase.io.ImmutableBytesWritable.write(ImmutableBytesWritable.java:159)
 at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:98)
 at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:82)
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1149)
 at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:715)
 at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
 at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
 at org.apache.phoenix.mapreduce.index.PhoenixIndexImportMapper.map(PhoenixIndexImportMapper.java:114)
 at org.apache.phoenix.mapreduce.index.PhoenixIndexImportMapper.map(PhoenixIndexImportMapper.java:48)
 at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)


Is there any best practice for dealing with situations like this?

Br, Margus