I see that Hive has a way to take a table with an array column and produce
multiple rows, one per array element.

Is there a built-in way to do the reverse?

Say I have a table with a unique key and an array. I do this:

> insert into table explodedTable select uniqueId, thing from originalTable
> lateral view explode(arrayOfThings) t as thing

Now I have a table with a row for each (uniqueId, element in arrayOfThings).

Is there any way to take the contents of explodedTable and essentially
produce the original table, reconstructing the arrayOfThings for each
uniqueId?

It seems, conceptually, that if I "cluster by uniqueId" then a reducer
knows that it will get all rows for each uniqueId bundled together, so it
ought to be fairly feasible to simply emit an unexploded row. However, I
can't seem to find a built-in way to do this.
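
To make the idea concrete, here is a rough sketch of the kind of reducer I have in mind, written as a standalone Python streaming script. It assumes the input is tab-separated (uniqueId, element) lines that already arrive grouped by uniqueId. The function name `unexplode` and the comma used to join the elements are placeholders of mine, not anything Hive provides.

```python
import sys
from itertools import groupby


def unexplode(lines, sep="\t", array_sep=","):
    """Collapse consecutive (uniqueId, element) rows into one row per uniqueId.

    Assumes rows for the same uniqueId are adjacent, which is what
    "cluster by uniqueId" should guarantee on the reducer side.
    """
    # Split each non-blank line into [uniqueId, element].
    rows = (line.rstrip("\n").split(sep, 1) for line in lines if line.strip())
    # groupby only merges adjacent rows, so it relies on the clustering above.
    for unique_id, group in groupby(rows, key=lambda row: row[0]):
        elements = [row[1] for row in group]
        yield unique_id + sep + array_sep.join(elements)


if __name__ == "__main__":
    for out in unexplode(sys.stdin):
        print(out)
```

Presumably something like this could be wired in through Hive's TRANSFORM ... USING clause after a "cluster by uniqueId", but I have not tried that, and it would hand back a delimited string rather than a real array column.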

Mike
