[ https://issues.apache.org/jira/browse/PHOENIX-2607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun Thangamani updated PHOENIX-2607:
-------------------------------------
    Attachment: region_server_2_jstack.txt
                phoenix_slow_map_process_jstack.txt
                hbase-rs04-slow-upload.log
                hbase-rs04-fast-upload.log
                hbase-rs03-slow-upload.log
                hbase-rs03-fast-upload.log
                hbase-rs02-slow-upload.log
                hbase-rs02-fast-upload.log
                hbase-rs01-slow-upload.log
                hbase-rs01-fast-upload.log
                hbase-master-slow-upload.log
                hbase-master-fast-upload.log

> PhoenixMapReduceUtil upserts with earlier ts (relative to latest data ts) slower by 25x after stats collection
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2607
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2607
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.4.0
>            Reporter: Arun Thangamani
>         Attachments: hbase-master-fast-upload.log, hbase-master-slow-upload.log, hbase-rs01-fast-upload.log, hbase-rs01-slow-upload.log, hbase-rs02-fast-upload.log, hbase-rs02-slow-upload.log, hbase-rs03-fast-upload.log, hbase-rs03-slow-upload.log, hbase-rs04-fast-upload.log, hbase-rs04-slow-upload.log, phoenix_slow_map_process_jstack.txt, region_server_2_jstack.txt
>
> Description of the problem:
> 1) We see a 25x slowdown when we go back in time to load data into a table (i.e., when specific timestamps are set on the connections during upserts).
> 2) We set phoenix.stats.useCurrentTime=false (and phoenix.stats.guidepost.per.region=1), which at least makes the forward-timestamp upserts perform correctly.
> 3) From what I can tell from the Phoenix source code, the attached logs, and the jstacks from the region servers, we continuously try to look up the uncached definition of the table whenever the client timestamp is earlier than the last modified timestamp of the table in stats.
> 4) To reproduce: create a table with timestamp=100, load 10M rows with PhoenixMapReduceUtil at timestamps 1447574400000 and 1448092800000, then wait for 20 mins (15+ min,
phoenix.stats.updateFrequency is 15 mins).
> After 20 mins, load 10M rows with an earlier timestamp than the latest data (timestamp=1447660800000) and observe the 25x slowness; after this, load a forward timestamp (1448179200000) once again and observe the quickness.
> 5) I was not able to reproduce this issue with simple multi-threaded upserts from a JDBC connection; with simple multi-threaded upserts the stats table never gets populated, unlike with PhoenixMapReduceUtil.
>
> We are trying to use Phoenix as a cache store to do analytics over the last 60 days of data, a total of about 1.5 billion rows.
> The table has a composite key and the data arrives at different times from different sources, so it is easier to maintain the timestamps of the data and expire the data automatically. This performance issue makes the difference between inserting the data in 10 mins versus 2 hours, and 2 hours of data inserts blocks up the cluster that we have.
> We are even talking about our use cases at the upcoming Strata conference in March (thanks to an excellent community).
>
> Steps to reproduce:
> Source code is available at https://github.com/athangamani/phoenix_mapreduce_timestamp_upsert, and the jar the source code produces is attached and readily runnable.
> 1) We use the following params to keep the stats collection happy, in order to isolate the specific issue:
>     phoenix.stats.useCurrentTime false
>     phoenix.stats.guidepost.per.region 1
> 2) Create a table in Phoenix by running the following main class from the project (StatPhoenixTableCreationTest). It will create the table with timestamp=100:
>     CREATE TABLE stat_table (
>         pk1 VARCHAR NOT NULL,
>         pk2 VARCHAR NOT NULL,
>         pk3 UNSIGNED_LONG NOT NULL,
>         stat1 UNSIGNED_LONG,
>         stat2 UNSIGNED_LONG,
>         stat3 UNSIGNED_LONG,
>         CONSTRAINT pk PRIMARY KEY (pk1, pk2, pk3)
>     ) SALT_BUCKETS=32, COMPRESSION='LZ4'
> 3) Open the code base to look at the sample for PhoenixMapReduceUtil with DBWritable.
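For reference, the two stats parameters from step 1 would be set in hbase-site.xml roughly as follows. This is a sketch: the property names and values are the ones given above, but file placement (client vs. region server side) depends on your deployment.

```xml
<!-- hbase-site.xml: stats settings used to isolate the issue -->
<property>
  <name>phoenix.stats.useCurrentTime</name>
  <value>false</value>
</property>
<property>
  <name>phoenix.stats.guidepost.per.region</name>
  <value>1</value>
</property>
```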
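The fixed-timestamp connection used throughout the repro can be sketched with plain JDBC: Phoenix reads the "CurrentSCN" connection property (the value of PhoenixRuntime.CURRENT_SCN_ATTRIB) and uses it as the cell timestamp for the connection. The class below is a minimal illustration, not the project's actual StatPhoenixTableCreationTest; the ZooKeeper quorum in the JDBC URL is a placeholder.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

// Sketch: create STAT_TABLE at a fixed client timestamp (SCN=100),
// as the repro's table-creation step does.
public class ScnTableCreate {

    static final String DDL =
        "CREATE TABLE stat_table (" +
        " pk1 VARCHAR NOT NULL," +
        " pk2 VARCHAR NOT NULL," +
        " pk3 UNSIGNED_LONG NOT NULL," +
        " stat1 UNSIGNED_LONG," +
        " stat2 UNSIGNED_LONG," +
        " stat3 UNSIGNED_LONG," +
        " CONSTRAINT pk PRIMARY KEY (pk1, pk2, pk3)" +
        ") SALT_BUCKETS=32, COMPRESSION='LZ4'";

    // "CurrentSCN" pins every read/write on the connection to this timestamp.
    static Properties scnProps(long timestamp) {
        Properties props = new Properties();
        props.setProperty("CurrentSCN", Long.toString(timestamp));
        return props;
    }

    // Not invoked below: requires a live Phoenix/HBase cluster.
    // "zk1,zk2,zk3:/hbase" is a placeholder quorum and parent znode.
    static void createTableAt(long timestamp) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:phoenix:zk1,zk2,zk3:/hbase", scnProps(timestamp))) {
            conn.createStatement().execute(DDL);
            conn.commit();
        }
    }

    public static void main(String[] args) {
        // The repro creates the table at timestamp=100, far earlier
        // than any of the data timestamps loaded later.
        System.out.println(scnProps(100L).getProperty("CurrentSCN"));
    }
}
```

The MapReduce job sets the same property on the Hadoop Configuration instead (`conf.set(PhoenixRuntime.CURRENT_SCN_ATTRIB, ...)`, step 4 below), so every mapper's Phoenix connection writes at the chosen timestamp.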
> 4) Within the codebase, we get the Phoenix connection for the mappers using the following setting, in order to have a fixed client timestamp:
>     conf.set(PhoenixRuntime.CURRENT_SCN_ATTRIB, "" + timestamp);
> 5) Fix the hbase-site.xml in the codebase for the ZooKeeper quorum and HBase parent znode info.
> 6) Simply run StatDataCreatorTest to create the data for the run (10M records) and load it into HDFS.
> 7) To run the ready-made jar attached, use the following commands:
>     hadoop jar phoenix_mr_ts_upsert-jar-with-dependencies.jar statPhoenixLoader hdfs:///user/*****/stat-data-1.txt STAT_TABLE 1447574400000
>     hadoop jar phoenix_mr_ts_upsert-jar-with-dependencies.jar statPhoenixLoader hdfs:///user/*****/stat-data-1.txt STAT_TABLE 1448092800000
> After 20 mins…
>     hadoop jar phoenix_mr_ts_upsert-jar-with-dependencies.jar statPhoenixLoader hdfs:///user/*****/stat-data-1.txt STAT_TABLE 1447660800000
>     hadoop jar phoenix_mr_ts_upsert-jar-with-dependencies.jar statPhoenixLoader hdfs:///user/*****/stat-data-1.txt STAT_TABLE 1449000000000
> 8) Observe the 25x slowness in the 3rd run and the normal pace in the 4th run.
> 9) Attached are the region server logs for the fast process in forward progression versus the slow progress with the reverse timestamp.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)