Hi Calvin,

This actually did the trick and we have it up and running now :).

One thing took me by complete surprise: the directory structure is not
reflected in S3. Instead, Tachyon puts everything into one folder, in
numbered files.

That makes me question two design decisions for Tachyon:

   - Make the client do worker/server stuff (connect directly to S3 in this
   case)
      - This makes deployment harder and the separation of responsibilities
      unclear (to say the least)

   - Store files in a proprietary Tachyon structure in the underlying
   filesystem (S3)
      - "/analytics/processed/tripcreator/events2/0_0_0.parquet" is stored
      like "/<bucket>/tmp/tachyon/data/115" (I hope this is not permanent and
      that the file is moved)
      - If permanent, it hinders use by any clients other than Tachyon

Please comment on the second point here and thank you for addressing the
first point in your email.

Regards,
 -Stefan



On Wed, Jul 29, 2015 at 5:50 PM, Calvin Jia <jia.cal...@gmail.com> wrote:

> Hi,
>
> I think the issue is in step 4, could you try adding
> the tachyon-underfs-s3 (0.7.0) jar as well as changing the jets3t version
> to 0.8.1 (this is the version Tachyon uses, does Drill require 0.9.3?).
>
> However, I think there may be other issues with that since Tachyon client
> may rely on other jars that are not available. One way around this is to
> compile Tachyon and use the tachyon-client-0.7.0-jar-with-dependencies
> (generated in tachyon/clients/client/target). But the first fix is probably
> worth a try since it shouldn't take much time.
>
> I think you hit on a very good point when you ask why does the Tachyon
> client require a connection to S3 and not just Tachyon. The current design
> for the client has under file system data operations (like writing
> s3n://streamanalytics/tmp/tachyon/workers/1438179000001/48) handled by
> the client to prevent a bottleneck at the worker. It's arguable that the
> Tachyon client should just delegate the work to the server so we can avoid
> having issues like this, but that will require some redesigning of the
> client.
>
> Hope this helps,
> Calvin
>
