Even so, as it stands, we have no "obligation" to make sure code you write that uses OfflineScanner will work across versions. Granted, we're not going to break it just to mess with you, but adding it to the public API is something we can do to help you sleep better at night.

Ara Ebrahimi wrote:
Actually I was wrong. OfflineScanner is public; OfflineIterator is
package-private. So it’s good enough :)

Ara.

On Feb 19, 2015, at 11:46 AM, Josh Elser <josh.el...@gmail.com> wrote:

Want to file a ticket, Ara? I didn't realize it wasn't directly in the
public API (only via m/r). I think it would make a nice addition.

Ara Ebrahimi wrote:
OfflineScanner is package-protected, so I'll need to hack it. If it
proves to be at least 20% faster, then it's worth having in the public
API, perhaps even letting users ask for a specific file to be scanned
rather than directing the scan by carefully defining a range that
touches the intended file.

Ara.

On Feb 19, 2015, at 8:15 AM, Keith Turner <ke...@deenlo.com> wrote:


On Thu, Feb 19, 2015 at 12:57 AM, Ara Ebrahimi
<ara.ebrah...@argyledata.com> wrote:

    Hi,

    I’m trying to optimize a connector we’ve written for Presto. In
    some cases we need to perform full table scans. These run across
    all the nodes, but each node is assigned only a sharded subset of
    the data, and each shard is held in only one RFile. I’m looking at
    AbstractInputFormat and OfflineIterator, and the code does not
    seem hard to use for this case. If the table is offline,
    OfflineIterator is used; it apparently reads the RFiles directly
    without any RPC, so I think it should be significantly faster. Is
    that so? Is there any drawback to using it while the table is not
    offline, provided no other app is modifying the table?


The code will throw an exception if the table is not offline (the
intent is to ensure the files are stable and not garbage collected).
As others have stated, you can clone the table.
Currently, offline scanning is only supported in the public API with
MapReduce. Out of curiosity, would you be interested in seeing this in
the public client API?


    Thanks,
    Ara.



    ________________________________

    This message is for the designated recipient only and may contain
    privileged, proprietary, or otherwise confidential information. If
    you have received it in error, please notify the sender
    immediately and delete the original. Any other use of the e-mail
    by you is prohibited. Thank you in advance for your cooperation.

    ________________________________





