Thanks Gromakowski and Chin Wei.
---Original---
From: "vincent gromakowski"
Date: 2017/5/23 00:54:33
To: "Chin Wei Low";
Cc: "user";"萝卜丝炒饭"<1427357...@qq.com>;"Gene
Pang";
Subject: Re: Are tachyon and akka removed from 2.1.1 please
thanks Gene.
---Original---
From: "Gene Pang"
Date: 2017/5/22 22:19:47
To: "萝卜丝炒饭"<1427357...@qq.com>;
Cc: "user";
Subject: Re: Are tachyon and akka removed from 2.1.1 please
Hi,
Tachyon has been renamed to Alluxio. Here is the documentation for running
Alluxio with Spark:
http://www.alluxio.org/docs/master/en/Running-Spark-on-Alluxio.html
Akka has been replaced by Netty in 1.6.
On 22 May 2017 15:25, "Chin Wei Low" wrote:
I think akka has been removed since 2.0.
On 22 May 2017 10:19 pm, "Gene Pang" wrote:
Hi,
Tachyon has been renamed to Alluxio. Here is the documentation for running
Alluxio with Spark
<http://www.alluxio.org/docs/master/en/Running-Spark-on-Alluxio.html>.
Hope this helps,
Gene
On Sun, May 21, 2017 at 6:15 PM, 萝卜丝炒饭 <1427357...@qq.com> wrote:
Hi all,
I read some papers about the Spark source code; the papers are based on version 1.2
and refer to Tachyon and Akka. When I read the 2.1 code, I cannot find the code for
Akka and Tachyon.
Are Tachyon and Akka removed from 2.1.1, please?
Hi,
If you are looking for how to run Spark on Alluxio (formerly Tachyon),
here is the documentation from Alluxio doc site:
http://www.alluxio.org/docs/master/en/Running-Spark-on-Alluxio.html
It still works for Spark 2.x.
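For readers landing here, a minimal sketch of what that looks like in Spark 2.x, assuming the Alluxio client jar is on the Spark classpath and an Alluxio master at a placeholder host (see the linked doc for the authoritative setup):

```scala
// Sketch only: read and write an alluxio:// path from Spark 2.x.
// "alluxio-master" and the file paths are placeholders, not from this thread.
val ds = spark.read.textFile("alluxio://alluxio-master:19998/data/input.txt")
ds.write.text("alluxio://alluxio-master:19998/data/output")
```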
Alluxio team also published articles on when and why running Spark (2.x
wrote:
Here is my understanding.
Spark used Tachyon as an off-heap solution for RDDs. In certain situations, it
would alleviate garbage collection pressure for the RDDs.
Tungsten, Spark 2's off-heap (columnar) format, is much more efficient and used
as the default. Alluxio no longer makes sense for this use case.
Hi folks,
What has happened with Tachyon / Alluxio in Spark 2? The docs no longer
mention it.
--
Oleksiy Dyagilev
Hi Calvin, I am running Spark KMeans on 24GB of data in a c3.2xlarge AWS
instance with 30GB of physical memory.
Spark will cache data off-heap to Tachyon, and the input data is also stored in
Tachyon.
Tachyon is configured to use 15GB of memory, with tiered storage.
Tachyon underFS is /tmp.
The only
Hey, Jia Zou
I'm curious about this exception. The error log you showed indicates that the
exception is related to unlockBlock; could you upload your full master.log
and worker.log from the tachyon/logs directory?
Best,
Cheng
On Friday, 29 January 2016 at 11:11:19 UTC+8, Calvin Jia wrote:
Hi,
Thanks for the detailed information. How large is the dataset you are
running against? Also did you change any Tachyon configurations?
Thanks,
Calvin
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
On Wed, Jan 27, 2016 at 5:53 AM, Jia Zou wrote:
BTW. The tachyon worker log says following:
2015-12-27 01:33:44,599 ERROR WORKER_LOGGER
(WorkerBlockMasterClient.java:getId) - java.net.SocketException: Connection
reset
org.apache.thrift.transport.TTransportException: java.net.SocketException:
Connection reset
BTW, the error happens when configuring Spark to read the input file from Tachyon,
like the following:
/home/ubuntu/spark-1.6.0/bin/spark-submit --properties-file
/home/ubuntu/HiBench/report/kmeans/spark/java/conf/sparkbench/spark.conf
--class org.apache.spark.examples.mllib.JavaKMeans --master spark://ip
Dear all, I keep getting the exception below when using Spark 1.6.0 on top of
Tachyon 0.8.2. Tachyon is 93% used and configured as CACHE_THROUGH.
Any suggestions will be appreciated, thanks!
=
Exception in thread "main" org.apache.spark.SparkException: J
Hi,
You should be able to point Hive to Tachyon instead of HDFS, and that
should allow Hive to access data in Tachyon. If Spark SQL was pointing to
an HDFS file, you could instead point it to a Tachyon file, and that should
work too.
Hope that helps,
Gene
On Wed, Jan 20, 2016 at 2:06 AM, Sea
Hi all, I want to mount some Hive tables in Tachyon, but I don't know how to
query the data in Tachyon with spark-sql. Does anyone know how?
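Following Gene's suggestion above (point the table location at Tachyon instead of HDFS), a hedged sketch of what that could look like; the table name, schema, and path are made up for illustration:

```scala
// Sketch: an external Hive table whose data lives in Tachyon,
// queried through Spark SQL. All names and paths are placeholders.
sqlContext.sql("""
  CREATE EXTERNAL TABLE events (id INT, name STRING)
  STORED AS PARQUET
  LOCATION 'tachyon://master:19998/warehouse/events'
""")
sqlContext.sql("SELECT COUNT(*) FROM events").show()
```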
Hi Mark,
Were you able to successfully store the RDD with Akhil's method? When you
read it back as an objectFile, you will also need to specify the correct
type.
You can find more information about integrating Spark and Tachyon on this
page: http://tachyon-project.org/documentation/Running-
Hi all,
Well, I have some questions about Tachyon and Spark. I found that the
interaction between Spark and Tachyon is caching RDDs off-heap. I
wonder whether you use Tachyon frequently, for example to cache RDDs. Does
caching RDDs in Tachyon have a profound effect in accelerating
Spark
I guess you can do a .saveAsObjectFiles and read it back as sc.objectFile
Thanks
Best Regards
On Fri, Oct 23, 2015 at 7:57 AM, mark wrote:
> I have Avro records stored in Parquet files in HDFS. I want to read these
> out as an RDD and save that RDD in Tachyon for any spark job that wan
Hi Shane,
Tachyon provides an api to get the block locations of the file which Spark
uses when scheduling tasks.
Hope this helps,
Calvin
On Fri, Oct 23, 2015 at 8:15 AM, Kinsella, Shane
wrote:
Hi all,
I am looking into how Spark handles data locality wrt Tachyon. My main concern
is how this is coordinated. Will it send a task based on a file loaded from
Tachyon to a node that it knows has that file locally, and how does it know
which nodes have what?
Kind regards,
Shane
I have Avro records stored in Parquet files in HDFS. I want to read these
out as an RDD and save that RDD in Tachyon for any spark job that wants the
data.
How do I save the RDD in Tachyon? What format do I use? Which RDD
'saveAs...' method do I want?
Thanks
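Per Akhil's suggestion earlier in the thread, the object-file route would look roughly like this sketch (the host, path, and MyRecord type are placeholders; note the element type must be supplied again when reading back):

```scala
// Sketch: round-tripping an RDD through Tachyon with object files.
// MyRecord, the host, and the path are illustrative placeholders.
case class MyRecord(id: Long, value: String)

val rdd = sc.parallelize(Seq(MyRecord(1L, "a"), MyRecord(2L, "b")))
rdd.saveAsObjectFile("tachyon://master:19998/shared/records")

// Any later Spark job can rebuild the RDD, specifying the element type:
val restored = sc.objectFile[MyRecord]("tachyon://master:19998/shared/records")
```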
not need to
>> enable Data Check pointing.
>>
>> From my experiments and the PR I mentioned , I configured the Meta Data
>> Check Pointing in HDFS , and stored the Received Blocks OFF_HEAP. And I
>> did not use any WAL . The PR I proposed would recover from Dr
not use any WAL . The PR I proposed would recover from Driver fail-over
without using any WAL like feature because Blocks are already available in
Tachyon. The Meta Data Checkpoint helps to recover the meta data about past
received blocks.
Now the question is , can I configure Tachyon as my Metad
Hi Dibyendu,
I am not sure I understand completely. But are you suggesting that
currently there is no way to enable Checkpoint directory to be in Tachyon?
Thanks
Nikunj
On Fri, Sep 25, 2015 at 11:49 PM, Dibyendu Bhattacharya <
dibyendu.bhattach...@gmail.com> wrote:
Hi,
Recently I was working on a PR to use Tachyon as the OFF_HEAP store for Spark
Streaming, and to make sure Spark Streaming can recover from driver failure and
recover the blocks from Tachyon.
The motivation for this PR is:
If the streaming application stores the blocks OFF_HEAP, it may not need
Hi Dibyendu,
How does one go about configuring spark streaming to use tachyon as its
place for storing checkpoints? Also, can one do this with tachyon running
on a completely different node than where spark processes are running?
Thanks
Nikunj
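If Tachyon is reachable as a Hadoop-compatible filesystem, pointing the checkpoint directory at it is just a URI, as in this untested sketch (host and path are placeholders); whether it actually works depends on Tachyon supporting the operations Spark needs, which is what this thread goes on to discuss:

```scala
// Sketch: a tachyon:// URI as the Spark Streaming checkpoint directory.
// Host and path are placeholders; success depends on Tachyon's
// Hadoop-filesystem support (e.g. file append), per this thread.
import org.apache.spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext(sc, Seconds(10))
ssc.checkpoint("tachyon://master:19998/spark/checkpoints")
```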
On Thu, May 21, 2015 at 8:35 PM, Dibyendu
The URL seems to have changed .. here is the one ..
http://tachyon-project.org/documentation/Tiered-Storage-on-Tachyon.html
On Wed, Aug 26, 2015 at 12:32 PM, Dibyendu Bhattacharya <
dibyendu.bhattach...@gmail.com> wrote:
> Sometime back I was playing with Spark and Tachyon and I a
Sometime back I was playing with Spark and Tachyon and I also found this
issue. The issue here is that TachyonBlockManager puts the blocks in the
WriteType.TRY_CACHE configuration, and because of this blocks are evicted
from the Tachyon cache when memory is full; when Spark tries to find the block
it throws
I am using Tachyon in the Spark program below, but I encounter a
BlockNotFoundException.
Does someone know what's wrong, and is there a guide on how to configure
Spark to work with Tachyon? Thanks!
conf.set("spark.externalBlockStore.url", "tachyon://10.18.19.33:
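For context, the external-block-store settings usually look something like the sketch below (host, port, and directory are placeholders; the property names varied across Spark 1.x releases, so check the docs for your version):

```scala
// Sketch: pointing Spark's external block store at a Tachyon master.
// Host, port, and base directory are placeholders.
conf.set("spark.externalBlockStore.url", "tachyon://master:19998")
conf.set("spark.externalBlockStore.baseDir", "/spark_blocks")
```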
Thanks Calvin - much appreciated !
-Abhishek-
On Aug 7, 2015, at 11:11 AM, Calvin Jia wrote:
Exactly!
The sharing part is used in the Spark Notebook (this one
<https://github.com/andypetrella/spark-notebook/blob/master/notebooks/Tachyon%20Test.snb>)
so we can share things between notebooks that are in different SparkContexts
(in different JVMs).
OTOH, we have a project that creates
Hi,
Tachyon <http://tachyon-project.org> manages memory off heap which can help
prevent long GC pauses. Also, using Tachyon will allow the data to be
shared between Spark jobs if they use the same dataset.
Here's <http://www.meetup.com/Tachyon/events/222485713/> a production use
Hi Abhishek,
Here's a production use case that may interest you:
http://www.meetup.com/Tachyon/events/222485713/
Baidu is using
Looks like you would get better response on Tachyon's mailing list:
https://groups.google.com/forum/?fromgroups#!forum/tachyon-users
Cheers
On Fri, Aug 7, 2015 at 9:56 AM, Abhishek R. Singh <
abhis...@tetrationanalytics.com> wrote:
> Do people use Tachyon in production, or is i
Do people use Tachyon in production, or is it experimental grade still?
Regards,
Abhishek
Spark is an in-memory engine and attempts to do computation in memory.
Tachyon is memory-centric distributed storage, OK, but how would that help
run Spark faster?
Hi June,
As I understand your problem, you are running Spark 1.3 and want to use
Tachyon with it. What you need to do is simply build the latest Spark and
Tachyon and set some configuration in Spark. In fact, Spark 1.3 has
"spark/core/pom.xml"; you have to find the "core" folder
Hi Tathagata,
Thanks for looking into this. Investigating further, I found that the issue
is that Tachyon does not support file append. The streaming receiver
writes to the WAL; when it fails and is restarted, it is not able to append
to the same WAL file after restart.
I raised this with the Tachyon user
some fault tolerant testing of Spark Streaming with Tachyon as
> OFF_HEAP block store. As I said in earlier email, I could able to solve the
> BlockNotFound exception when I used Hierarchical Storage of Tachyon ,
> which is good.
>
> I continue doing some testing around storing the S
Dear all,
We’re organizing a meetup <http://www.meetup.com/Tachyon/events/222485713/> on
May 28th at IBM in Foster City that might be of interest to the Spark
community. The focus is a production use case of Spark and Tachyon at Baidu.
You can sign up here: http://www.meetup.com/Tachyon/
Hi,
You can apply this patch <https://github.com/apache/spark/pull/5354> and
recompile.
Hope this helps,
Calvin
On Tue, Apr 28, 2015 at 1:19 PM, sara mustafa
wrote:
> Hi Zhang,
>
> How did you compile Spark 1.3.1 with Tachyon? When I changed the Tachyon
> version to 0.6.3
Daniel,
Instead of using localhost:19998, you may want to use the real ip address
TachyonMaster is configured. You should be able to see more info in
Tachyon's UI as well. More info could be found here:
http://tachyon-project.org/master/Running-Tachyon-on-EC2.html
Best,
Haoyuan
On Fri, A
I have a cluster launched with spark-ec2.
I can see a TachyonMaster process running,
but I do not seem to be able to use tachyon from the spark-shell.
if I try
rdd.saveAsTextFile("tachyon://localhost:19998/path")
I get
15/04/24 19:18:31 INFO TaskSetManager: Starting task 12.2 in stag
zhangxiongfei
wrote:
> Hi,
> I did some tests on Parquet files with the Spark SQL DataFrame API.
> I generated 36 gzip-compressed Parquet files with Spark SQL and stored them
> on Tachyon. The size of each file is about 222M. Then I read them with the
> code below.
> val tfs
> =sqlContext.parque
Would you mind to open a JIRA for this?
I think your suspicion makes sense. Will have a look at this tomorrow.
Thanks for reporting!
Cheng
On 4/13/15 7:13 PM, zhangxiongfei wrote:
Hi experts,
I ran the code below in the Spark shell to access Parquet files in Tachyon.
1. First, created a DataFrame by
Response inline.
On Tue, Mar 31, 2015 at 10:41 PM, Sean Bigdatafun wrote:
(resending...)
I was thinking the same setup… But the more I think of this problem, and
the more interesting this could be.
If we allocate 50% total memory to Tachyon statically, then the Mesos
benefits of dynamically scheduling resources go away altogether.
Can Tachyon be resource managed by
Ankur,
Response inline.
On Tue, Mar 31, 2015 at 4:49 PM, Ankur Chauhan
wrote:
Hi Haoyuan,
So on each mesos slave node I should allocate/section off some amount
of memory for tachyon (let's say 50% of the total memory) and the rest
for regular mesos tasks?
This means, on each slave node I would have tachyon worker (+
Tachyon should be co-located with Spark in this case.
Best,
Haoyuan
On Tue, Mar 31, 2015 at 4:30 PM, Ankur Chauhan
wrote:
Hi,
I am fairly new to the spark ecosystem and I have been trying to setup
a spark on mesos deployment. I can't seem to figure out the "best
practices" around HDFS and Tachyon. The documentation about Spark's
data-locality section
You are hitting https://issues.apache.org/jira/browse/SPARK-6330. It has
been fixed in 1.3.1, which will be released soon.
On Fri, Mar 27, 2015 at 10:42 PM, sud_self <852677...@qq.com> wrote:
> spark version is 1.3.0 with tachyon-0.6.1
>
> QUESTION DESCRIPTION: rdd.saveAsObje
Spark version is 1.3.0 with tachyon-0.6.1.
QUESTION DESCRIPTION: rdd.saveAsObjectFile("tachyon://host:19998/test") and
rdd.saveAsTextFile("tachyon://host:19998/test") succeed, but
rdd.toDF().saveAsParquetFile("tachyon://host:19998/test&
Thanks haoyuan.
fightf...@163.com
From: Haoyuan Li
Date: 2015-03-16 12:59
To: fightf...@163.com
CC: Shao, Saisai; user
Subject: Re: RE: Building spark over specified tachyon
Here is a patch: https://github.com/apache/spark/pull/4867
On Sun, Mar 15, 2015 at 8:46 PM, fightf...@163.com wrote
Here is a patch: https://github.com/apache/spark/pull/4867
On Sun, Mar 15, 2015 at 8:46 PM, fightf...@163.com
wrote:
Thanks, Jerry.
I got it that way. Just to make sure: is there some option to directly
specify the Tachyon version?
fightf...@163.com
From: Shao, Saisai
Date: 2015-03-16 11:10
To: fightf...@163.com
CC: user
Subject: RE: Building spark over specified tachyon
I think you could change the pom file under the Spark project to update the
Tachyon-related dependency version and rebuild it (in case the API is compatible
and the behavior is the same).
I'm not sure whether there is any command you can use to compile against a specific
Tachyon version.
Thanks
Jerry
From: f
Hi all,
Noting that the current Spark releases are built with Tachyon 0.5.0:
if we want to recompile Spark with Maven, targeting a specific Tachyon
version (let's say the most recent 0.6.0 release),
how should that be done? What should the Maven compile command look like?
Thanks
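Per Jerry's reply in this thread, the change is a version bump in Spark's pom followed by a normal rebuild. A hedged sketch (the groupId/artifactId shown match Spark 1.x's Tachyon dependency as I understand it; verify against your checkout):

```xml
<!-- Sketch: in core/pom.xml, bump the Tachyon client dependency, e.g.: -->
<dependency>
  <groupId>org.tachyonproject</groupId>
  <artifactId>tachyon-client</artifactId>
  <version>0.6.0</version>
</dependency>
```

then rebuild with the usual `mvn -DskipTests clean package`.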
Did you try something like:
myRDD.saveAsObjectFile("tachyon://localhost:19998/Y")
val newRDD = sc.objectFile[MyObject]("tachyon://localhost:19998/Y")
Thanks
Best Regards
On Sun, Mar 8, 2015 at 3:59 PM, Yijie Shen
wrote:
Hi,
I would like to share an RDD among several Spark applications,
i.e., create one in application A, publish its ID somewhere, and get the RDD back
directly using the ID in application B.
I know I can use Tachyon just as a filesystem, with
s.saveAsTextFile("tachyon://localhost:19998/Y") like this.
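Concretely, the filesystem-style sharing described above is a save in one application and a load in the other; the "ID" is just an agreed-on path. A sketch with placeholder host and path:

```scala
// Application A: persist the RDD to a Tachyon path known to both apps.
rdd.saveAsTextFile("tachyon://master:19998/shared/rdd-001")

// Application B (separate SparkContext): rebuild an equivalent RDD.
val shared = sc.textFile("tachyon://master:19998/shared/rdd-001")
```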
Hi all,
Is it possible to store Spark shuffle files in distributed memory like
Tachyon instead of spilling them to disk?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Store-the-shuffled-files-in-memory-using-Tachyon-tp21944.html
Sent from the Apache
Agreed with Jerry. Aside from Tachyon, seeing this for general debugging
would be very helpful.
Haoyuan, is that feature you are referring to related to
https://issues.apache.org/jira/browse/SPARK-975?
In the interim, I've found the "toDebugString()" method useful (but it
renders
Jerry,
Great question. Spark and Tachyon capture lineage information at different
granularities. We are working on an integration between Spark/Tachyon about
this. Hope to get it ready to be released soon.
Best,
Haoyuan
On Fri, Jan 2, 2015 at 12:24 PM, Jerry Lam wrote:
> Hi spark develop
Hi spark developers,
I was thinking it would be nice to extract the data lineage information
from a data processing pipeline. I assume that spark/tachyon keeps this
information somewhere. For instance, a data processing pipeline uses
datasource A and B to produce C. C is then used by another
bumping this thread up
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/UpdateStateByKey-persist-to-Tachyon-tp20798p20930.html
Sent from the Apache Spark User List mailing list archive at Nabble.com
IMHO: cache doesn't provide redundancy, and it's in the same JVM, so it's much
faster.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-on-Tachyon-tp1463p20800.html
Sent from the Apache Spark User List mailing list archive at Nabbl
All seem to run OK. However, I got no web UIs for the Spark master or slaves.
Logging into the nodes, I see HDFS and Tachyon processes but none for Spark.
The /root/tachyon folder has a full complement of files including conf,
logs and so forth:
$ ls /root/tachyon
bin docs libexec
StorageLevel.OFF_HEAP requires running Tachyon:
http://spark.apache.org/docs/latest/programming-guide.html
If you don't know whether you have Tachyon or not, you probably don't :)
http://tachyon-project.org/
For local testing, you can use other persist() solutions without running
Tachyon
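A minimal sketch of the two options just described (the host and the external-block-store property name are placeholders; the property varied across Spark 1.x releases):

```scala
import org.apache.spark.storage.StorageLevel

// Option 1: with a Tachyon master running (placeholder host/property):
// conf.set("spark.externalBlockStore.url", "tachyon://master:19998")
rdd.persist(StorageLevel.OFF_HEAP)

// Option 2: for local testing without Tachyon, pick another level
// (on a different RDD; a storage level cannot be changed once set):
otherRdd.persist(StorageLevel.MEMORY_AND_DISK)
```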
related to Tachyon. But I don't know if
I have tachyon or not.
14/11/21 14:17:54 WARN storage.TachyonBlockManager: Attempt 1 to create
tachyon dir null failed
java.io.IOException: Failed to connect to master localhost/127.0.0.1:19998
after 5 attempts
at tachyon.client.TachyonFS.co
Is it possible to store Spark shuffle files on Tachyon?
-- Forwarded message --
From: Haoyuan Li
Date: Thu, Oct 2, 2014 at 10:12 AM
Subject: Second Bay Area Tachyon meetup: October 21st, hosted by Pivotal
(Limited Space)
To: tachyon-us...@googlegroups.com
Hi folks,
We've posted the second Tachyon meetup featuring exciting up
Hi,
Did you manage to figure this out? I would appreciate it if you could share the
answer.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/spark-ec2-script-with-Tachyon-tp9996p15249.html
Sent from the Apache Spark User List mailing list archive at Nabble.com
And it worked earlier with non-parquet directory.
On Thu, Aug 21, 2014 at 12:22 PM, Evan Chan wrote:
The underFS is HDFS btw.
On Thu, Aug 21, 2014 at 12:22 PM, Evan Chan wrote:
> Spark 1.0.2, Tachyon 0.4.1, Hadoop 1.0 (standard EC2 config)
>
> scala> val gdeltT =
> sqlContext.parquetFile("tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005/")
> 14/08/21 19:07:14
Spark 1.0.2, Tachyon 0.4.1, Hadoop 1.0 (standard EC2 config)
scala> val gdeltT =
sqlContext.parquetFile("tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005/")
14/08/21 19:07:14 INFO :
initialize(tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005,
Configuration: core-defa
Fantastic!
Sent while mobile. Pls excuse typos etc.
On Aug 19, 2014 4:09 PM, "Haoyuan Li" wrote:
Hi folks,
We've posted the first Tachyon meetup, which will be on August 25th and is
hosted by Yahoo! (Limited Space):
http://www.meetup.com/Tachyon/events/200387252/ . Hope to see you there!
Best,
Haoyuan
--
Haoyuan Li
AMPLab, EECS, UC Berkeley
http://www.cs.berkeley.edu/~haoyuan/
More interesting: if spark-shell is started on the master node (test01),
then
parquetFile.saveAsParquetFile("tachyon://test01.zala:19998/parquet_tablex")
14/08/12 11:42:06 INFO : initialize(tachyon://...
...
...
14/08/12 11:42:06 INFO : File does not exist:
tachyon://test01.zala:19998/parq
spark.speculation was not set; is there any speculative execution on the Tachyon side?
tachyon-env.sh only changed following
export TACHYON_MASTER_ADDRESS=test01.zala
#export TACHYON_UNDERFS_ADDRESS=$TACHYON_HOME/underfs
export TACHYON_UNDERFS_ADDRESS=hdfs://test01.zala:8020
export TACHYON_WORKER_MEMORY_SIZE
Is the speculative execution enabled?
Best,
Haoyuan
On Mon, Aug 11, 2014 at 8:08 AM, chutium wrote:
Sharing/reusing RDDs is always useful for many use cases. Is this possible
by persisting an RDD on Tachyon?
For example, off-heap persisting a named RDD into a given path (instead of
/tmp_spark_tachyon/spark-xxx-xxx-xxx)
or
saveAsParquetFile on Tachyon.
I tried to save a SchemaRDD on Tachyon:
val
We are investigating various ways to integrate with Tachyon. I'll note
that you can already use saveAsParquetFile and
parquetFile(...).registerAsTable("tableName") (soon to be registerTempTable
in Spark 1.1) to store data into tachyon and query it with Spark SQL.
On Fri, Aug 1,
Hi, I would like to ask if spark-sql tables cached by Tachyon is a
feature to be migrated from Shark.
I imagine from the user perspective it would look like this:
CREATE TABLE data TBLPROPERTIES("sparksql.cache" = "tachyon") AS SELECT a, b, c FROM
data_on_disk WHERE month="May";
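Until such a TBLPROPERTIES hook exists, the workaround described above (saveAsParquetFile plus registering the Parquet file as a table) can be spelled out roughly as follows; names and the Tachyon path are placeholders, and registerAsTable became registerTempTable in Spark 1.1:

```scala
// Sketch: materialize the selection as Parquet in Tachyon, then register
// it as a Spark SQL table. All names and paths are placeholders.
val selected = sqlContext.sql(
  "SELECT a, b, c FROM data_on_disk WHERE month = 'May'")
selected.saveAsParquetFile("tachyon://master:19998/tables/data_may")
sqlContext.parquetFile("tachyon://master:19998/tables/data_may")
  .registerTempTable("data")
```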
Hi,
It seems that the spark-ec2 script deploys the Tachyon module along with the
other setup.
I am trying to use .persist(OFF_HEAP) for RDD persistence, but on worker I
see this error
--
Failed to connect (2) to master localhost/127.0.0.1:19998 :
java.net.ConnectException: Connection refused
--
More updates:
Seems in TachyonBlockManager.scala(line 118) of Spark 1.1.0, the
TachyonFS.mkdir() method is called, which creates a directory in Tachyon.
Right after that, TachyonFS.getFile() method is called. In all the versions
of Tachyon I tried (0.4.1, 0.4.0), the second method will return a
Hi guys,
I'm running Spark 1.0.0 with Tachyon 0.4.1, both in single node mode.
Tachyon's own tests (./bin/tachyon runTests) work fine, and manual file
system operations like mkdir work well. But when I tried to run a very
simple Spark task with RDD persist as OFF_HEAP, I got the