> The first thing that strikes me is that createReader takes a file.  
> But for acid, you need to pass the directory because it needs to look for any 
> relevant delta files.  

The ACID 2.x impl, the InputFormat gets a directory - but a Reader should still 
be getting an individual file.

In fact, it should be getting something smaller than a File (ideally an HDFS 
block) which is encoded as FileSplit (path, offset, len) and a ValidTxnList.

In the original ACID impl, multiple delta files get a single Reader, while in 
the new ACID 2.x impl the concept of a "base file + deltas" is irrelevant. 

All insert deltas are equivalent base files - the base concept is (1 Stripe) + 
(Relevant deletes) == 1 vector reader.

There's no need to know which delete deltas were already read unlike the UPDATE 
ops (i.e Split #1 and Split #2 can load their deletes independently, without 
worrying about double row outputs).

If a delete delta which was loaded is not found in the input split, it has no 
effect on the reader's output correctness.

> I don’t like that the user has to make a different call in the acid case.
 
You need to identify within ORC whether the file provided is a base file or a 
delta insert file.

If the file is a base_xx, then all deletes exist in ./delete_deltas_*/

if the file is named delta_xx, then all deletes exist in ../delete_deltas*/

Those are strictly enforced by the ACID implementation & can serve as easy 
assumptions.

Other than that, a non-null ValidTxnList is all it should take.

>   the user will already have to have split logic.

The part where the logic-splits off is into InputFormat - detecting compaction 
during split generation is strictly the InputFormat's problem.

There's a bit of magic there which is in plain sight, like how the INSERT 
OVERWRITE works transactionally (HIVE-14988).

For me, the clear division is to look at this problem as "Details about file 
names" (includes  HIVE-14535) and "Details about a Stripe" (Reader + valid-txns 
+ deletes application).

Everything in the middle is just the same as regular ORC, like PPD.

>From the other side of the mirror, the flat ORC API is pretty much a Null 
>ROW__ID pruned already, with 0 deletes and Long.MAX watermark in the 
>ReaderOptions.

> implementation of Reader and RecordReader that understand acid

There's an "*" to most of the above - a reader which intends to modify the data 
might need a different API, to be explicit that the ROW__ID is projected out  
to the user.

Cheers,
Gopal



On 9/13/17, 3:48 PM, "Alan Gates" <alanfga...@gmail.com> wrote:

    I’ve been looking at the OrcFile.createReader method and thinking about
    what I will need to do to read acid files.  The first thing that strikes me
    is that createReader takes a file.  But for acid, you need to pass the
    directory because it needs to look for any relevant delta files.  Acid also
    requires a ValidTxnList.  We can add that to the ReaderOptions.
    
    It seems the best way to do this is to add a new method
    OrcFile.createAcidReader that takes a directory.  I don’t like that the
    user has to make a different call in the acid case.  But the user will have
    to set the ValidTxnList in the reader options anyway, so the user will
    already have to have split logic.
    
    Every way I could think of for createReader to decide if it was dealing
    with an acid directory or a non-acid file seemed to create jumbled
    semantics.
    Does the user pass a directory for the acid case but a file for non-acid?
    Yuck.
    Does the user pass a base file in the acid case and the code walks up the
    path to find the relevant directory?  Seems error prone and slow.
    
    Related to this is my assumption that I will need to write a new
    implementation of Reader and RecordReader that understand acid.  This seems
    better than putting a bunch of branches into the existing code to try to
    handle both cases.
    
    Thoughts?
    
    Alan.
    


Reply via email to