Thank you very much for thinking of this. I do not have such files. I will file a bug as per your suggestion.
On Monday, March 14, 2016, Prasanth Jayachandran <pjayachand...@hortonworks.com> wrote:

> Hi Marcin
>
> I came across this issue recently. Do you have old ORC files (created with
> Hive 0.11) in the table/partition? If so, this patch is required:
>
> https://issues.apache.org/jira/browse/HIVE-13285
>
> Thanks
> Prasanth
>
> On Mar 10, 2016, at 5:02 PM, Prasanth Jayachandran
> <pjayachand...@hortonworks.com> wrote:
>
> After Hive 1.2.1 there is one patch that went in related to alter table
> concatenation: https://issues.apache.org/jira/browse/HIVE-12450
>
> I am not sure if it's related, though. Could you please file a bug for
> this? It would be great if you can attach a small enough repro for this
> issue. I can verify it and provide a fix in case of a bug.
>
> Thanks
> Prasanth
>
> On Mar 8, 2016, at 5:52 AM, Marcin Tustin <mtus...@handybook.com> wrote:
>
> Hi Mich,
>
> DDL as below.
>
> Hi Prasanth,
>
> Hive version as reported by Hortonworks is 1.2.1.2.3.
> Thanks,
> Marcin
>
> CREATE TABLE `<tablename>`(
>   `col1` string,
>   `col2` bigint,
>   `col3` string,
>   `col4` string,
>   `col4` string,
>   `col5` bigint,
>   `col6` string,
>   `col7` string,
>   `col8` string,
>   `col9` string,
>   `col10` boolean,
>   `col11` boolean,
>   `col12` string,
>   `metadata` struct<file:string,hostname:string,level:string,line:bigint,logger:string,method:string,millis:bigint,pid:bigint,timestamp:string>,
>   `col14` string,
>   `col15` bigint,
>   `col16` double,
>   `col17` bigint)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> LOCATION
>   'hdfs://reporting-handy/<path>'
> TBLPROPERTIES (
>   'COLUMN_STATS_ACCURATE'='true',
>   'numFiles'='2800',
>   'numRows'='297263',
>   'rawDataSize'='454748401',
>   'totalSize'='31310353',
>   'transient_lastDdlTime'='1457437204')
>
> Time taken: 1.062 seconds, Fetched: 34 row(s)
>
> On Tue, Mar 8, 2016 at 4:29 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
>> Hi
>>
>> Can you please provide the DDL for this table: "show create table <TABLE>"
>>
>> Dr Mich Talebzadeh
>>
>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>
>> http://talebzadehmich.wordpress.com
>>
>> On 7 March 2016 at 23:25, Marcin Tustin <mtus...@handybook.com> wrote:
>>
>>> Hi All,
>>>
>>> Following on from our parquet vs orc discussion, today I observed
>>> Hive's alter table ... concatenate command remove rows from an
>>> ORC-formatted table.
>>>
>>> 1. Has anyone else observed this (fuller description below)? And
>>> 2.
>>> How do parquet users handle the file fragmentation issue?
>>>
>>> Description of the problem:
>>>
>>> Today I ran a query to count rows by date. Relevant days below:
>>>
>>> 2016-02-28  16866
>>> 2016-03-06    219
>>> 2016-03-07   2863
>>>
>>> I then ran concatenation on that table. Rerunning the same query
>>> resulted in:
>>>
>>> 2016-02-28  16866
>>> 2016-03-06    219
>>> 2016-03-07   1158
>>>
>>> Note the reduced count for 2016-03-07.
>>>
>>> I then ran concatenation a second time, and the query a third time:
>>>
>>> 2016-02-28  16344
>>> 2016-03-06    219
>>> 2016-03-07   1158
>>>
>>> Now the count for 2016-02-28 is reduced.
>>>
>>> This doesn't look like an elimination of duplicates occurring by design
>>> - these didn't all happen on the first run of concatenation. It looks
>>> like concatenation just kind of loses data.
>>>
>>> Want to work at Handy? Check out our culture deck and open roles
>>> <http://www.handy.com/careers>
>>> Latest news <http://www.handy.com/press> at Handy
>>> Handy just raised $50m
>>> <http://venturebeat.com/2015/11/02/on-demand-home-service-handy-raises-50m-in-round-led-by-fidelity/>
>>> led by Fidelity
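[Editor's note: for readers trying to reproduce the sequence described above, here is a minimal HiveQL sketch. The table name `events` and date column `created_date` are hypothetical placeholders standing in for the anonymized table in the DDL; the actual queries used were not included in the thread.]

```sql
-- 1. Count rows per day before concatenation (hypothetical table/column names).
SELECT to_date(created_date) AS d, count(*) AS n
FROM events
GROUP BY to_date(created_date);

-- 2. Merge the table's many small ORC files. This is the code path touched
--    by HIVE-12450 and HIVE-13285, discussed earlier in the thread.
ALTER TABLE events CONCATENATE;

-- 3. Re-run the same count. Per the report above, per-day counts dropped
--    after one or two concatenation runs, indicating lost rows.
SELECT to_date(created_date) AS d, count(*) AS n
FROM events
GROUP BY to_date(created_date);
```

For a partitioned table, the same check would target one partition at a time, e.g. `ALTER TABLE events PARTITION (dt='2016-03-07') CONCATENATE;`.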