Hi, can you please provide the DDL for this table, i.e. the output of "show create table <TABLE>"?

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

On 7 March 2016 at 23:25, Marcin Tustin <mtus...@handybook.com> wrote:

> Hi All,
>
> Following on from our parquet vs orc discussion, today I observed
> hive's alter table ... concatenate command remove rows from an ORC
> formatted table.
>
> 1. Has anyone else observed this (fuller description below)? And
> 2. How do parquet users handle the file fragmentation issue?
>
> Description of the problem:
>
> Today I ran a query to count rows by date. Relevant days below:
> 2016-02-28 16866
> 2016-03-06 219
> 2016-03-07 2863
> I then ran concatenation on that table. Rerunning the same query resulted
> in:
>
> 2016-02-28 16866
> 2016-03-06 219
> 2016-03-07 1158
>
> Note the reduced count for 2016-03-07.
>
> I then ran concatenation a second time, and the query a third time:
> 2016-02-28 16344
> 2016-03-06 219
> 2016-03-07 1158
>
> Now the count for 2016-02-28 is reduced.
>
> This doesn't look like an elimination of duplicates occurring by design -
> these didn't all happen on the first run of concatenation. It looks like
> concatenation just kind of loses data.