Re: on shark, is tachyon less efficient than memory_only cache strategy ?

2014-07-28 Thread qingyang li
Hi Haoyuan, thanks for replying.


2014-07-21 16:29 GMT+08:00 Haoyuan Li haoyuan...@gmail.com:

 Qingyang,

 Aha. Got it.

 800MB of data is pretty small. Loading from Tachyon does have a bit of
 extra overhead, but it will have more benefit when the data size is
 larger. Also, if you store the table in Tachyon, you can have different
 Shark servers query the data at the same time. For more on the trade-offs,
 please refer to this page:
 http://tachyon-project.org/Running-Shark-on-Tachyon.html

 Best,

 Haoyuan


 On Wed, Jul 16, 2014 at 12:06 AM, qingyang li liqingyang1...@gmail.com
 wrote:

   Let me describe my scenario:
   --
   I have a Spark cluster and a Tachyon cluster on 8 machines (24 cores,
   16GB memory per machine). On Tachyon I created a table containing 800MB
   of data; running a query against it through Shark takes 2.43s, but when
   I create the same table in Spark memory and run the same SQL, it takes
   1.56s. So data on Tachyon costs more time than data in Spark memory.
   Both cases run 150 map tasks, with 16-20 map tasks per node.
   I believe that when the data is on Tachyon, Shark has each Spark slave
   load data from the Tachyon slave on the same node. I have tried tuning
   some Shark and Tachyon configuration, but still cannot get the Tachyon
   case faster than 2.43s.
   Does anyone have some ideas?

   By the way, my Tachyon block size is 1GB now and I want to change it.
   Will setting tachyon.user.default.block.size.byte=8M work? If not, what
   does tachyon.user.default.block.size.byte mean?
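
   (For reference, a minimal sketch of how this property is commonly set;
   the TACHYON_JAVA_OPTS mechanism and the restart are assumptions, not
   taken from this thread. The property takes a plain byte count, so 8MB
   is spelled out as 8388608 rather than written as 8M:

       # conf/tachyon-env.sh -- sketch; restart the Tachyon cluster after
       # changing it. Applies to newly written files only.
       export TACHYON_JAVA_OPTS="-Dtachyon.user.default.block.size.byte=8388608"
   )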
 
 
  2014-07-14 13:13 GMT+08:00 qingyang li liqingyang1...@gmail.com:
 
    Shark, thanks for replying.
    Let me state my question again.
    --
    I create a table using: create table xxx1
    tblproperties(shark.cache=tachyon) as select * from xxx2
    When executing SQL (for example, select * from xxx1) through Shark,
    Shark will read data into Shark's memory from Tachyon's memory.
    I think that if Shark always loads the data from Tachyon each time we
    execute SQL, it is less efficient.
    Could we use some cache policy (such as CacheAllPolicy, FIFOCachePolicy,
    or LRUCachePolicy) to cache the data and avoid reading it from Tachyon
    for each SQL query?
    --
   --
  
  
  
   2014-07-14 2:47 GMT+08:00 Haoyuan Li haoyuan...@gmail.com:
  
   Qingyang,
  
    Are you asking about Spark or Shark? (The first email was Shark, the
    last email was Spark.)
  
   Best,
  
   Haoyuan
  
  
   On Wed, Jul 9, 2014 at 7:40 PM, qingyang li liqingyang1...@gmail.com
 
   wrote:
  
     Could I set some cache policy to have Spark load data from Tachyon
     only once for all SQL queries? For example, by using CacheAllPolicy,
     FIFOCachePolicy, or LRUCachePolicy. I have tried those three policies,
     but they did not help.
     I think that if Spark always reloads the data for each SQL query, it
     will hurt query speed and take more time than the case where the data
     is managed by Spark itself.
   
   
   
   
2014-07-09 1:19 GMT+08:00 Haoyuan Li haoyuan...@gmail.com:
   
      Yes. For Shark, the two modes, shark.cache=tachyon and
      shark.cache=memory, have the same ser/de overhead. Shark loads data
      from outside of the process in Tachyon mode, with the following
      benefits:

      - In-memory data sharing across multiple Shark instances (i.e.
        stronger isolation)
      - Instant recovery of in-memory tables
      - Reduced heap size = faster GC in Shark
      - If the table is larger than the memory size, only the hot columns
        will be cached in memory

      from http://tachyon-project.org/master/Running-Shark-on-Tachyon.html
      and https://github.com/amplab/shark/wiki/Running-Shark-with-Tachyon

 Haoyuan


 On Tue, Jul 8, 2014 at 9:58 AM, Aaron Davidson 
 ilike...@gmail.com
wrote:

      Shark's in-memory format is already serialized (it's compressed and
      column-based).
 
 
  On Tue, Jul 8, 2014 at 9:50 AM, Mridul Muralidharan 
   mri...@gmail.com
  wrote:
 
   You are ignoring serde costs :-)
  
   - Mridul
  
   On Tue, Jul 8, 2014 at 8:48 PM, Aaron Davidson 
   ilike...@gmail.com
  wrote:
     Tachyon should only be marginally less performant than memory_only,
     because we mmap the data from Tachyon's ramdisk. We do not have to,
     say, transfer the data over a pipe from Tachyon; we can directly read
     from the buffers in the same way that Shark reads from its in-memory
     columnar format.
   
   
   
On Tue, Jul 8, 2014 at 1:18 AM, qingyang li 
 liqingyang1...@gmail.com
wrote:
   
     Hi, when I create a table, I can specify the cache strategy using
     shark.cache.
     I think shark.cache

Re: on shark, is tachyon less efficient than memory_only cache strategy ?

2014-07-13 Thread qingyang li
Shark, thanks for replying.
Let me state my question again.
--
I create a table using: create table xxx1
tblproperties(shark.cache=tachyon) as select * from xxx2
When executing SQL (for example, select * from xxx1) through Shark, Shark
will read data into Shark's memory from Tachyon's memory.
I think that if Shark always loads the data from Tachyon each time we
execute SQL, it is less efficient.
Could we use some cache policy (such as CacheAllPolicy, FIFOCachePolicy, or
LRUCachePolicy) to cache the data and avoid reading it from Tachyon for each
SQL query?
--



2014-07-14 2:47 GMT+08:00 Haoyuan Li haoyuan...@gmail.com:

 Qingyang,

 Are you asking about Spark or Shark? (The first email was Shark, the last
 email was Spark.)

 Best,

 Haoyuan


 On Wed, Jul 9, 2014 at 7:40 PM, qingyang li liqingyang1...@gmail.com
 wrote:

  Could I set some cache policy to have Spark load data from Tachyon only
  once for all SQL queries? For example, by using CacheAllPolicy,
  FIFOCachePolicy, or LRUCachePolicy. I have tried those three policies,
  but they did not help.
  I think that if Spark always reloads the data for each SQL query, it will
  hurt query speed and take more time than the case where the data is
  managed by Spark itself.
 
 
 
 
  2014-07-09 1:19 GMT+08:00 Haoyuan Li haoyuan...@gmail.com:
 
   Yes. For Shark, the two modes, shark.cache=tachyon and
   shark.cache=memory, have the same ser/de overhead. Shark loads data from
   outside of the process in Tachyon mode, with the following benefits:

  - In-memory data sharing across multiple Shark instances (i.e. stronger
    isolation)
  - Instant recovery of in-memory tables
  - Reduced heap size = faster GC in Shark
  - If the table is larger than the memory size, only the hot columns will
    be cached in memory

   from http://tachyon-project.org/master/Running-Shark-on-Tachyon.html and
   https://github.com/amplab/shark/wiki/Running-Shark-with-Tachyon
  
   Haoyuan
  
  
   On Tue, Jul 8, 2014 at 9:58 AM, Aaron Davidson ilike...@gmail.com
  wrote:
  
Shark's in-memory format is already serialized (it's compressed and
column-based).
   
   
On Tue, Jul 8, 2014 at 9:50 AM, Mridul Muralidharan 
 mri...@gmail.com
wrote:
   
 You are ignoring serde costs :-)

 - Mridul

 On Tue, Jul 8, 2014 at 8:48 PM, Aaron Davidson ilike...@gmail.com
 
wrote:
   Tachyon should only be marginally less performant than memory_only,
   because we mmap the data from Tachyon's ramdisk. We do not have to, say,
   transfer the data over a pipe from Tachyon; we can directly read from
   the buffers in the same way that Shark reads from its in-memory columnar
   format.
 
 
 
  On Tue, Jul 8, 2014 at 1:18 AM, qingyang li 
   liqingyang1...@gmail.com
  wrote:
 
  Hi, when I create a table, I can specify the cache strategy using
  shark.cache.
  I think shark.cache=memory_only means the data is managed by Spark and
  lives in the same JVM as the executor, while shark.cache=tachyon means
  the data is managed by Tachyon, which is off-heap, so the data is not in
  the same JVM as the executor and Spark will load data from Tachyon for
  each SQL query. So, is Tachyon less efficient than the memory_only cache
  strategy?
  If yes, can we have Spark load all the data from Tachyon once for all SQL
  queries if I want to use the Tachyon cache strategy, since Tachyon is
  more HA than memory_only?
 

   
  
  
  
   --
   Haoyuan Li
   AMPLab, EECS, UC Berkeley
   http://www.cs.berkeley.edu/~haoyuan/
  
 



 --
 Haoyuan Li
 AMPLab, EECS, UC Berkeley
 http://www.cs.berkeley.edu/~haoyuan/



when insert data into one table which is on tachyon, how can i control the data position?

2014-07-10 Thread qingyang li
When inserting data into a table that is on Tachyon (the data is small, so
it will not be partitioned automatically), how can I control the data
placement? I mean, how can I specify which machine the data should live on?
If we cannot control it, what is the data placement strategy of Tachyon or
Spark?


Re: on shark, is tachyon less efficient than memory_only cache strategy ?

2014-07-09 Thread qingyang li
Could I set some cache policy to have Spark load data from Tachyon only once
for all SQL queries? For example, by using CacheAllPolicy, FIFOCachePolicy,
or LRUCachePolicy. I have tried those three policies, but they did not help.
I think that if Spark always reloads the data for each SQL query, it will
hurt query speed and take more time than the case where the data is managed
by Spark itself.




2014-07-09 1:19 GMT+08:00 Haoyuan Li haoyuan...@gmail.com:

 Yes. For Shark, the two modes, shark.cache=tachyon and shark.cache=memory,
 have the same ser/de overhead. Shark loads data from outside of the process
 in Tachyon mode, with the following benefits:

- In-memory data sharing across multiple Shark instances (i.e. stronger
  isolation)
- Instant recovery of in-memory tables
- Reduced heap size = faster GC in Shark
- If the table is larger than the memory size, only the hot columns will be
  cached in memory

 from http://tachyon-project.org/master/Running-Shark-on-Tachyon.html and
 https://github.com/amplab/shark/wiki/Running-Shark-with-Tachyon

 Haoyuan


 On Tue, Jul 8, 2014 at 9:58 AM, Aaron Davidson ilike...@gmail.com wrote:

  Shark's in-memory format is already serialized (it's compressed and
  column-based).
 
 
  On Tue, Jul 8, 2014 at 9:50 AM, Mridul Muralidharan mri...@gmail.com
  wrote:
 
   You are ignoring serde costs :-)
  
   - Mridul
  
   On Tue, Jul 8, 2014 at 8:48 PM, Aaron Davidson ilike...@gmail.com
  wrote:
 Tachyon should only be marginally less performant than memory_only,
 because we mmap the data from Tachyon's ramdisk. We do not have to, say,
 transfer the data over a pipe from Tachyon; we can directly read from the
 buffers in the same way that Shark reads from its in-memory columnar
 format.
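
 (As a generic illustration of why an mmap-based read path is cheap: the
 sketch below memory-maps a ramdisk-backed file with java.nio. This is not
 the actual Spark/Tachyon code path, and the file path is a placeholder:

     // Scala sketch: map the file instead of copying it through a pipe.
     import java.io.RandomAccessFile
     import java.nio.channels.FileChannel

     val raf = new RandomAccessFile("/mnt/ramdisk/some-tachyon-block", "r")
     val buf = raf.getChannel.map(FileChannel.MapMode.READ_ONLY, 0, raf.length)
     // The buffer is backed by the ramdisk pages directly, so reading it
     // does not copy the whole file into the JVM heap.
     val firstByte = buf.get(0)
     raf.close()
 )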
   
   
   
On Tue, Jul 8, 2014 at 1:18 AM, qingyang li 
 liqingyang1...@gmail.com
wrote:
   
 Hi, when I create a table, I can specify the cache strategy using
 shark.cache.
 I think shark.cache=memory_only means the data is managed by Spark and
 lives in the same JVM as the executor, while shark.cache=tachyon means the
 data is managed by Tachyon, which is off-heap, so the data is not in the
 same JVM as the executor and Spark will load data from Tachyon for each SQL
 query. So, is Tachyon less efficient than the memory_only cache strategy?
 If yes, can we have Spark load all the data from Tachyon once for all SQL
 queries if I want to use the Tachyon cache strategy, since Tachyon is more
 HA than memory_only?
   
  
 



 --
 Haoyuan Li
 AMPLab, EECS, UC Berkeley
 http://www.cs.berkeley.edu/~haoyuan/



on shark, is tachyon less efficient than memory_only cache strategy ?

2014-07-08 Thread qingyang li
Hi, when I create a table, I can specify the cache strategy using
shark.cache.
I think shark.cache=memory_only means the data is managed by Spark and lives
in the same JVM as the executor, while shark.cache=tachyon means the data is
managed by Tachyon, which is off-heap, so the data is not in the same JVM as
the executor and Spark will load data from Tachyon for each SQL query. So,
is Tachyon less efficient than the memory_only cache strategy?
If yes, can we have Spark load all the data from Tachyon once for all SQL
queries if I want to use the Tachyon cache strategy, since Tachyon is more
HA than memory_only?
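
(For reference, the two modes being compared, as a minimal sketch; the table
and source names are placeholders, and the quoting follows Hive
TBLPROPERTIES syntax even though the thread writes the property unquoted:

    -- data managed by Spark, inside the executor JVMs
    create table t_mem tblproperties("shark.cache" = "memory")
      as select * from src;

    -- data managed by Tachyon, off-heap and outside the executor JVMs
    create table t_tachyon tblproperties("shark.cache" = "tachyon")
      as select * from src;
)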


Re: task always lost

2014-07-02 Thread qingyang li
Executors keep being removed.

Someone encountered the same issue:
https://groups.google.com/forum/#!topic/spark-users/-mYn6BF-Y5Y

-
14/07/02 17:41:16 INFO storage.BlockManagerMasterActor: Trying to remove
executor 20140616-104524-1694607552-5050-26919-1 from BlockManagerMaster.
14/07/02 17:41:16 INFO storage.BlockManagerMaster: Removed
20140616-104524-1694607552-5050-26919-1 successfully in removeExecutor
14/07/02 17:41:16 DEBUG spark.MapOutputTrackerMaster: Increasing epoch to 10
14/07/02 17:41:16 INFO scheduler.DAGScheduler: Host gained which was in
lost list earlier: bigdata001
14/07/02 17:41:16 DEBUG scheduler.TaskSchedulerImpl: parentName: , name:
TaskSet_0, runningTasks: 0
14/07/02 17:41:16 DEBUG scheduler.TaskSchedulerImpl: parentName: , name:
TaskSet_0, runningTasks: 0
14/07/02 17:41:16 INFO scheduler.TaskSetManager: Starting task 0.0:0 as TID
12 on executor 20140616-143932-1694607552-5050-4080-3: bigdata004
(NODE_LOCAL)
14/07/02 17:41:16 INFO scheduler.TaskSetManager: Serialized task 0.0:0 as
10785 bytes in 1 ms
14/07/02 17:41:16 INFO scheduler.TaskSetManager: Starting task 0.0:1 as TID
13 on executor 20140616-104524-1694607552-5050-26919-3: bigdata002
(NODE_LOCAL


2014-07-02 12:01 GMT+08:00 qingyang li liqingyang1...@gmail.com:

 Also, this appears in the warning log:

 E0702 11:35:08.869998 17840 slave.cpp:2310] Container
 'af557235-2d5f-4062-aaf3-a747cb3cd0d1' for executor
 '20140616-104524-1694607552-5050-26919-1' of framework
 '20140702-113428-1694607552-5050-17766-' failed to start: Failed to
 fetch URIs for container 'af557235-2d5f-4062-aaf3-a747cb3cd0d1': exit
 status 32512


 2014-07-02 11:46 GMT+08:00 qingyang li liqingyang1...@gmail.com:

 Here is the log:

 E0702 10:32:07.599364 14915 slave.cpp:2686] Failed to unmonitor container
 for executor 20140616-104524-1694607552-5050-26919-1 of framework
 20140702-102939-1694607552-5050-14846-: Not monitored


 2014-07-02 1:45 GMT+08:00 Aaron Davidson ilike...@gmail.com:

 Can you post the logs from any of the dying executors?


 On Tue, Jul 1, 2014 at 1:25 AM, qingyang li liqingyang1...@gmail.com
 wrote:

   I am using Mesos 0.19 and Spark 0.9.0. The Mesos cluster is started, but
   when I use spark-shell to submit a job, the tasks are always lost. Here
   is the log:
  --
  14/07/01 16:24:27 INFO DAGScheduler: Host gained which was in lost list
  earlier: bigdata005
  14/07/01 16:24:27 INFO TaskSetManager: Starting task 0.0:1 as TID 4042
 on
  executor 20140616-143932-1694607552-5050-4080-2: bigdata005
 (PROCESS_LOCAL)
  14/07/01 16:24:27 INFO TaskSetManager: Serialized task 0.0:1 as 1570
 bytes
  in 0 ms
  14/07/01 16:24:28 INFO TaskSetManager: Re-queueing tasks for
  20140616-104524-1694607552-5050-26919-1 from TaskSet 0.0
  14/07/01 16:24:28 WARN TaskSetManager: Lost TID 4041 (task 0.0:0)
  14/07/01 16:24:28 INFO DAGScheduler: Executor lost:
  20140616-104524-1694607552-5050-26919-1 (epoch 3427)
  14/07/01 16:24:28 INFO BlockManagerMasterActor: Trying to remove
 executor
  20140616-104524-1694607552-5050-26919-1 from BlockManagerMaster.
  14/07/01 16:24:28 INFO BlockManagerMaster: Removed
  20140616-104524-1694607552-5050-26919-1 successfully in removeExecutor
  14/07/01 16:24:28 INFO TaskSetManager: Re-queueing tasks for
  20140616-143932-1694607552-5050-4080-2 from TaskSet 0.0
  14/07/01 16:24:28 WARN TaskSetManager: Lost TID 4042 (task 0.0:1)
  14/07/01 16:24:28 INFO DAGScheduler: Executor lost:
  20140616-143932-1694607552-5050-4080-2 (epoch 3428)
  14/07/01 16:24:28 INFO BlockManagerMasterActor: Trying to remove
 executor
  20140616-143932-1694607552-5050-4080-2 from BlockManagerMaster.
  14/07/01 16:24:28 INFO BlockManagerMaster: Removed
  20140616-143932-1694607552-5050-4080-2 successfully in removeExecutor
  14/07/01 16:24:28 INFO DAGScheduler: Host gained which was in lost list
  earlier: bigdata005
  14/07/01 16:24:28 INFO DAGScheduler: Host gained which was in lost list
  earlier: bigdata001
  14/07/01 16:24:28 INFO TaskSetManager: Starting task 0.0:1 as TID 4043
 on
  executor 20140616-143932-1694607552-5050-4080-2: bigdata005
 (PROCESS_LOCAL)
  14/07/01 16:24:28 INFO TaskSetManager: Serialized task 0.0:1 as 1570
 bytes
  in 0 ms
  14/07/01 16:24:28 INFO TaskSetManager: Starting task 0.0:0 as TID 4044
 on
  executor 20140616-104524-1694607552-5050-26919-1: bigdata001
  (PROCESS_LOCAL)
  14/07/01 16:24:28 INFO TaskSetManager: Serialized task 0.0:0 as 1570
 bytes
  in 0 ms
 
 
   It seems another user has also encountered this problem:
 
 
 http://mail-archives.apache.org/mod_mbox/incubator-mesos-dev/201305.mbox/%3c201305161047069952...@nfs.iscas.ac.cn%3E
 






Re: task always lost

2014-07-01 Thread qingyang li
Here is the log:

E0702 10:32:07.599364 14915 slave.cpp:2686] Failed to unmonitor container
for executor 20140616-104524-1694607552-5050-26919-1 of framework
20140702-102939-1694607552-5050-14846-: Not monitored


2014-07-02 1:45 GMT+08:00 Aaron Davidson ilike...@gmail.com:

 Can you post the logs from any of the dying executors?


 On Tue, Jul 1, 2014 at 1:25 AM, qingyang li liqingyang1...@gmail.com
 wrote:

  I am using Mesos 0.19 and Spark 0.9.0. The Mesos cluster is started, but
  when I use spark-shell to submit a job, the tasks are always lost. Here
  is the log:
  --
  14/07/01 16:24:27 INFO DAGScheduler: Host gained which was in lost list
  earlier: bigdata005
  14/07/01 16:24:27 INFO TaskSetManager: Starting task 0.0:1 as TID 4042 on
  executor 20140616-143932-1694607552-5050-4080-2: bigdata005
 (PROCESS_LOCAL)
  14/07/01 16:24:27 INFO TaskSetManager: Serialized task 0.0:1 as 1570
 bytes
  in 0 ms
  14/07/01 16:24:28 INFO TaskSetManager: Re-queueing tasks for
  20140616-104524-1694607552-5050-26919-1 from TaskSet 0.0
  14/07/01 16:24:28 WARN TaskSetManager: Lost TID 4041 (task 0.0:0)
  14/07/01 16:24:28 INFO DAGScheduler: Executor lost:
  20140616-104524-1694607552-5050-26919-1 (epoch 3427)
  14/07/01 16:24:28 INFO BlockManagerMasterActor: Trying to remove executor
  20140616-104524-1694607552-5050-26919-1 from BlockManagerMaster.
  14/07/01 16:24:28 INFO BlockManagerMaster: Removed
  20140616-104524-1694607552-5050-26919-1 successfully in removeExecutor
  14/07/01 16:24:28 INFO TaskSetManager: Re-queueing tasks for
  20140616-143932-1694607552-5050-4080-2 from TaskSet 0.0
  14/07/01 16:24:28 WARN TaskSetManager: Lost TID 4042 (task 0.0:1)
  14/07/01 16:24:28 INFO DAGScheduler: Executor lost:
  20140616-143932-1694607552-5050-4080-2 (epoch 3428)
  14/07/01 16:24:28 INFO BlockManagerMasterActor: Trying to remove executor
  20140616-143932-1694607552-5050-4080-2 from BlockManagerMaster.
  14/07/01 16:24:28 INFO BlockManagerMaster: Removed
  20140616-143932-1694607552-5050-4080-2 successfully in removeExecutor
  14/07/01 16:24:28 INFO DAGScheduler: Host gained which was in lost list
  earlier: bigdata005
  14/07/01 16:24:28 INFO DAGScheduler: Host gained which was in lost list
  earlier: bigdata001
  14/07/01 16:24:28 INFO TaskSetManager: Starting task 0.0:1 as TID 4043 on
  executor 20140616-143932-1694607552-5050-4080-2: bigdata005
 (PROCESS_LOCAL)
  14/07/01 16:24:28 INFO TaskSetManager: Serialized task 0.0:1 as 1570
 bytes
  in 0 ms
  14/07/01 16:24:28 INFO TaskSetManager: Starting task 0.0:0 as TID 4044 on
  executor 20140616-104524-1694607552-5050-26919-1: bigdata001
  (PROCESS_LOCAL)
  14/07/01 16:24:28 INFO TaskSetManager: Serialized task 0.0:0 as 1570
 bytes
  in 0 ms
 
 
   It seems another user has also encountered this problem:
 
 
 http://mail-archives.apache.org/mod_mbox/incubator-mesos-dev/201305.mbox/%3c201305161047069952...@nfs.iscas.ac.cn%3E
 



Re: task always lost

2014-07-01 Thread qingyang li
Also, this appears in the warning log:

E0702 11:35:08.869998 17840 slave.cpp:2310] Container
'af557235-2d5f-4062-aaf3-a747cb3cd0d1' for executor
'20140616-104524-1694607552-5050-26919-1' of framework
'20140702-113428-1694607552-5050-17766-' failed to start: Failed to
fetch URIs for container 'af557235-2d5f-4062-aaf3-a747cb3cd0d1': exit
status 32512
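
(For reference: exit status 32512 is 127 << 8, i.e. the command the slave
ran to fetch the executor could not be found or executed. On Spark-on-Mesos
setups of this era a common cause is a missing or unfetchable executor URI.
A sketch of the usual configuration; the paths, host names, and versions
are placeholders, not taken from this thread:

    # conf/spark-env.sh on the driver machine -- sketch
    export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so
    # A Spark tarball every Mesos slave can fetch; with an hdfs:// URI the
    # slave needs a working hadoop client to perform the fetch.
    export SPARK_EXECUTOR_URI=hdfs://namenode:9000/dist/spark-0.9.0-incubating-bin-hadoop2.tgz
)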


2014-07-02 11:46 GMT+08:00 qingyang li liqingyang1...@gmail.com:

 Here is the log:

 E0702 10:32:07.599364 14915 slave.cpp:2686] Failed to unmonitor container
 for executor 20140616-104524-1694607552-5050-26919-1 of framework
 20140702-102939-1694607552-5050-14846-: Not monitored


 2014-07-02 1:45 GMT+08:00 Aaron Davidson ilike...@gmail.com:

 Can you post the logs from any of the dying executors?


 On Tue, Jul 1, 2014 at 1:25 AM, qingyang li liqingyang1...@gmail.com
 wrote:

   I am using Mesos 0.19 and Spark 0.9.0. The Mesos cluster is started, but
   when I use spark-shell to submit a job, the tasks are always lost. Here
   is the log:
  --
  14/07/01 16:24:27 INFO DAGScheduler: Host gained which was in lost list
  earlier: bigdata005
  14/07/01 16:24:27 INFO TaskSetManager: Starting task 0.0:1 as TID 4042
 on
  executor 20140616-143932-1694607552-5050-4080-2: bigdata005
 (PROCESS_LOCAL)
  14/07/01 16:24:27 INFO TaskSetManager: Serialized task 0.0:1 as 1570
 bytes
  in 0 ms
  14/07/01 16:24:28 INFO TaskSetManager: Re-queueing tasks for
  20140616-104524-1694607552-5050-26919-1 from TaskSet 0.0
  14/07/01 16:24:28 WARN TaskSetManager: Lost TID 4041 (task 0.0:0)
  14/07/01 16:24:28 INFO DAGScheduler: Executor lost:
  20140616-104524-1694607552-5050-26919-1 (epoch 3427)
  14/07/01 16:24:28 INFO BlockManagerMasterActor: Trying to remove
 executor
  20140616-104524-1694607552-5050-26919-1 from BlockManagerMaster.
  14/07/01 16:24:28 INFO BlockManagerMaster: Removed
  20140616-104524-1694607552-5050-26919-1 successfully in removeExecutor
  14/07/01 16:24:28 INFO TaskSetManager: Re-queueing tasks for
  20140616-143932-1694607552-5050-4080-2 from TaskSet 0.0
  14/07/01 16:24:28 WARN TaskSetManager: Lost TID 4042 (task 0.0:1)
  14/07/01 16:24:28 INFO DAGScheduler: Executor lost:
  20140616-143932-1694607552-5050-4080-2 (epoch 3428)
  14/07/01 16:24:28 INFO BlockManagerMasterActor: Trying to remove
 executor
  20140616-143932-1694607552-5050-4080-2 from BlockManagerMaster.
  14/07/01 16:24:28 INFO BlockManagerMaster: Removed
  20140616-143932-1694607552-5050-4080-2 successfully in removeExecutor
  14/07/01 16:24:28 INFO DAGScheduler: Host gained which was in lost list
  earlier: bigdata005
  14/07/01 16:24:28 INFO DAGScheduler: Host gained which was in lost list
  earlier: bigdata001
  14/07/01 16:24:28 INFO TaskSetManager: Starting task 0.0:1 as TID 4043
 on
  executor 20140616-143932-1694607552-5050-4080-2: bigdata005
 (PROCESS_LOCAL)
  14/07/01 16:24:28 INFO TaskSetManager: Serialized task 0.0:1 as 1570
 bytes
  in 0 ms
  14/07/01 16:24:28 INFO TaskSetManager: Starting task 0.0:0 as TID 4044
 on
  executor 20140616-104524-1694607552-5050-26919-1: bigdata001
  (PROCESS_LOCAL)
  14/07/01 16:24:28 INFO TaskSetManager: Serialized task 0.0:0 as 1570
 bytes
  in 0 ms
 
 
   It seems another user has also encountered this problem:
 
 
 http://mail-archives.apache.org/mod_mbox/incubator-mesos-dev/201305.mbox/%3c201305161047069952...@nfs.iscas.ac.cn%3E
 





Re: encounter jvm problem when integreation spark with mesos

2014-06-17 Thread qingyang li
Somebody else has also encountered this problem:
http://mail-archives.apache.org/mod_mbox/spark-user/201404.mbox/%3cafc0d60983129f4f9fbad571aa422c9a5af8f...@mail-mbx1.ad.renci.org%3E


2014-06-17 12:31 GMT+08:00 Andrew Ash and...@andrewash.com:

 Hi qingyang,

 This looks like an issue with the open source version of the Java runtime
 (called OpenJDK) that causes the JVM to fail.  Can you try using the JVM
 released by Oracle and see if it has the same issue?

 Thanks!
 Andrew


 On Mon, Jun 16, 2014 at 9:24 PM, qingyang li liqingyang1...@gmail.com
 wrote:

   Hi, I encountered a JVM problem when integrating Spark with Mesos.
   Here is the log when I run spark-shell:
  -48ce131dc5af
  14/06/17 12:24:55 INFO HttpServer: Starting HTTP Server
  14/06/17 12:24:55 INFO SparkUI: Started Spark Web UI at
  http://bigdata001:4040
  #
  # A fatal error has been detected by the Java Runtime Environment:
  #
  #  SIGSEGV (0xb) at pc=0x7f94f4843d21, pid=5956, tid=140277175580416
  #
  # JRE version: OpenJDK Runtime Environment (7.0_51-b02) (build
  1.7.0_51-mockbuild_2014_01_15_01_39-b00)
  # Java VM: OpenJDK 64-Bit Server VM (24.45-b08 mixed mode linux-amd64
  compressed oops)
  # Problematic frame:
  # V  [libjvm.so+0x5e5d21]  JNI_CreateJavaVM+0x6551
  #
  # Core dump written. Default location:
  /home/zjw/spark/spark-0.9.0-incubating-bin-hadoop2/core or core.5956
  #
  # An error report file with more information is saved as:
  # /tmp/jvm-5956/hs_error.log
  #
  # If you would like to submit a bug report, please include
  # instructions on how to reproduce the bug and visit:
  #   http://icedtea.classpath.org/bugzilla
  #
  bin/spark-shell: line 101:  5956 Aborted (core dumped)
  $FWDIR/bin/spark-class $OPTIONS org.apache.spark.repl.Main $@
 



Re: encounter jvm problem when integreation spark with mesos

2014-06-17 Thread qingyang li
Here is the core stack trace:
-
(gdb) bt
#0  0x7fc0153fc925 in raise () from /lib64/libc.so.6
#1  0x7fc0153fe105 in abort () from /lib64/libc.so.6
#2  0x7fc014d78405 in os::abort(bool) ()
   from /home/zjw/jdk1.7/jdk1.7.0_51/jre/lib/amd64/server/libjvm.so
#3  0x7fc014ef7347 in VMError::report_and_die() ()
   from /home/zjw/jdk1.7/jdk1.7.0_51/jre/lib/amd64/server/libjvm.so
#4  0x7fc014d7cd8f in JVM_handle_linux_signal ()
   from /home/zjw/jdk1.7/jdk1.7.0_51/jre/lib/amd64/server/libjvm.so
#5  signal handler called
#6  0x7fc014b96ce9 in jni_GetByteArrayElements ()
   from /home/zjw/jdk1.7/jdk1.7.0_51/jre/lib/amd64/server/libjvm.so
#7  0x7fbff70f002c in GetByteArrayElements (env=<value optimized out>,
jobj=0x7fbf8c000f80)
at /home/zjw/jdk1.7/jdk1.7.0_51//include/jni.h:1668
#8  construct<mesos::FrameworkInfo> (env=<value optimized out>,
jobj=0x7fbf8c000f80)
at ../../src/java/jni/construct.cpp:123
#9  0x7fbff70f51c8 in
Java_org_apache_mesos_MesosSchedulerDriver_initialize (env=0x7fc010d189e8,
thiz=0x7fbfd94f6830) at
../../src/java/jni/org_apache_mesos_MesosSchedulerDriver.cpp:528
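
(Following Andrew's and Andy's suggestions quoted below, the usual checks
are to run the driver on an Oracle JDK instead of OpenJDK and to make sure
the Mesos native library Spark loads matches a Mesos version this Spark
release was built against. A sketch; the paths and versions are
placeholders, not taken from this thread:

    # environment for spark-shell -- sketch
    export JAVA_HOME=/usr/java/jdk1.7.0_51        # Oracle JDK, per Andrew
    export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so  # must match the cluster's Mesos version
)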



2014-06-17 16:12 GMT+08:00 andy petrella andy.petre...@gmail.com:

 Yep, but no real resolution nor advances on this topic, since in the end
 we chose to stick with a compatible version of Mesos (0.14.1 for the
 moment). But I'm still convinced it has to do with a native libs clash :-s

  aℕdy ℙetrella
 about.me/noootsab


 On Tue, Jun 17, 2014 at 9:57 AM, qingyang li liqingyang1...@gmail.com
 wrote:

  Somebody else has also encountered this problem:
 
 
 http://mail-archives.apache.org/mod_mbox/spark-user/201404.mbox/%3cafc0d60983129f4f9fbad571aa422c9a5af8f...@mail-mbx1.ad.renci.org%3E
 
 
  2014-06-17 12:31 GMT+08:00 Andrew Ash and...@andrewash.com:
 
   Hi qingyang,
  
    This looks like an issue with the open source version of the Java
    runtime (called OpenJDK) that causes the JVM to fail. Can you try using
    the JVM released by Oracle and see if it has the same issue?
  
   Thanks!
   Andrew
  
  
   On Mon, Jun 16, 2014 at 9:24 PM, qingyang li liqingyang1...@gmail.com
 
   wrote:
  
 Hi, I encountered a JVM problem when integrating Spark with Mesos.
 Here is the log when I run spark-shell:
-48ce131dc5af
14/06/17 12:24:55 INFO HttpServer: Starting HTTP Server
14/06/17 12:24:55 INFO SparkUI: Started Spark Web UI at
http://bigdata001:4040
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x7f94f4843d21, pid=5956,
  tid=140277175580416
#
# JRE version: OpenJDK Runtime Environment (7.0_51-b02) (build
1.7.0_51-mockbuild_2014_01_15_01_39-b00)
# Java VM: OpenJDK 64-Bit Server VM (24.45-b08 mixed mode linux-amd64
compressed oops)
# Problematic frame:
# V  [libjvm.so+0x5e5d21]  JNI_CreateJavaVM+0x6551
#
# Core dump written. Default location:
/home/zjw/spark/spark-0.9.0-incubating-bin-hadoop2/core or core.5956
#
# An error report file with more information is saved as:
# /tmp/jvm-5956/hs_error.log
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
#   http://icedtea.classpath.org/bugzilla
#
bin/spark-shell: line 101:  5956 Aborted (core
 dumped)
$FWDIR/bin/spark-class $OPTIONS org.apache.spark.repl.Main $@
   
  
 



encounter jvm problem when integreation spark with mesos

2014-06-16 Thread qingyang li
Hi, I encountered a JVM problem when integrating Spark with Mesos.
Here is the log when I run spark-shell:
-48ce131dc5af
14/06/17 12:24:55 INFO HttpServer: Starting HTTP Server
14/06/17 12:24:55 INFO SparkUI: Started Spark Web UI at
http://bigdata001:4040
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x7f94f4843d21, pid=5956, tid=140277175580416
#
# JRE version: OpenJDK Runtime Environment (7.0_51-b02) (build
1.7.0_51-mockbuild_2014_01_15_01_39-b00)
# Java VM: OpenJDK 64-Bit Server VM (24.45-b08 mixed mode linux-amd64
compressed oops)
# Problematic frame:
# V  [libjvm.so+0x5e5d21]  JNI_CreateJavaVM+0x6551
#
# Core dump written. Default location:
/home/zjw/spark/spark-0.9.0-incubating-bin-hadoop2/core or core.5956
#
# An error report file with more information is saved as:
# /tmp/jvm-5956/hs_error.log
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
#   http://icedtea.classpath.org/bugzilla
#
bin/spark-shell: line 101:  5956 Aborted (core dumped)
$FWDIR/bin/spark-class $OPTIONS org.apache.spark.repl.Main $@


how spark partition data when creating table like create table xxx as select * from xxx

2014-05-29 Thread qingyang li
Hi Spark developers, I am using Shark/Spark and am puzzled by the following
questions; I cannot find any information on the web, so I am asking here.
1. How does Spark partition data in memory when creating a table with
create table a tblproperties(shark.cache=memory) as select * from table b?
In other words, how many RDDs will be created, and how does Spark decide
the number of RDDs?

2. How does Spark partition data on Tachyon when creating a table with
create table a tblproperties(shark.cache=tachyon) as select * from table b?
In other words, how many files will be created, and how does Spark decide
the number of files?
I found this Tachyon setting, tachyon.user.default.block.size.byte; what
does it mean? Could I set it to control the size of each file?

Thanks for any guidance.
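
(For reference, a minimal way to inspect how an input is split into
partitions from spark-shell; this uses the Spark 0.9-era API, and the path
and split hint are placeholders:

    // Scala sketch: the partition count of a textFile RDD follows the
    // underlying input splits; the second argument is a minimum-splits hint.
    val rdd = sc.textFile("hdfs://namenode:9000/user/hive/warehouse/b", 8)
    println(rdd.partitions.size)  // number of partitions actually created
)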


Re: can RDD be shared across mutil spark applications?

2014-05-18 Thread qingyang li
Thanks for sharing; I am using Tachyon to store RDDs now.
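
(For reference, a minimal sketch of sharing data between two applications
through Tachyon, as discussed in the replies quoted below; it assumes the
Tachyon client is on the classpath and the tachyon:// filesystem is
configured, and the master address and path are placeholders:

    // Scala sketch -- application 1 materializes an RDD into Tachyon:
    rdd.saveAsTextFile("tachyon://tachyonmaster:19998/shared/table_a")

    // application 2, a separate SparkContext, reads the same data back:
    val shared = sc.textFile("tachyon://tachyonmaster:19998/shared/table_a")
)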


2014-05-18 12:02 GMT+08:00 Christopher Nguyen c...@adatao.com:

 Qing Yang, Andy is correct in answering your direct question.

  At the same time, depending on your context, you may be able to apply a
  pattern where you turn the single Spark application into a service, and
  multiple clients of that service can indeed share access to the same RDDs.

 Several groups have built apps based on this pattern, and we will also show
 something with this behavior at the upcoming Spark Summit (multiple users
 collaborating on named DDFs with the same underlying RDDs).

 Sent while mobile. Pls excuse typos etc.
 On May 18, 2014 9:40 AM, Andy Konwinski andykonwin...@gmail.com wrote:

  RDDs cannot currently be shared across multiple SparkContexts without
 using
  something like the Tachyon project (which is a separate
 project/codebase).
 
  Andy
  On May 16, 2014 2:14 PM, qingyang li liqingyang1...@gmail.com wrote:
 
  
  
 



can RDD be shared across mutil spark applications?

2014-05-16 Thread qingyang li



get -101 error code when running select query

2014-04-23 Thread qingyang li
Hi, I have started a SharkServer2 and am using Java code to send queries to
this server via Hive JDBC, but I got this error:
--
FAILED: Execution Error, return code -101 from shark.execution.SparkTask
org.apache.hive.service.cli.HiveSQLException: Error while processing
statement: FAILED: Execution Error, return code -101 from
shark.execution.SparkTask
at shark.server.SharkSQLOperation.run(SharkSQLOperation.scala:45)
at
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:180)
at
org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:152)
at
org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:203)
at
org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1133)
at
org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1118)
at
org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at
org.apache.hive.service.auth.TUGIContainingProcessor$2.run(TUGIContainingProcessor.java:64)
at
org.apache.hive.service.auth.TUGIContainingProcessor$2.run(TUGIContainingProcessor.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at
org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:524)
at
org.apache.hive.service.auth.TUGIContainingProcessor.process(TUGIContainingProcessor.java:61)
at
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

---
Has anyone encountered this problem?
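
(For reference, a minimal sketch of the JDBC access path described above;
the host, port, and table name are placeholders:

    // Scala sketch: SharkServer2 speaks the HiveServer2 JDBC protocol.
    import java.sql.DriverManager

    Class.forName("org.apache.hive.jdbc.HiveDriver")
    val conn = DriverManager.getConnection("jdbc:hive2://bigdata001:10000/default", "", "")
    val stmt = conn.createStatement()
    val rs = stmt.executeQuery("select * from table01 limit 10")
    while (rs.next()) println(rs.getString(1))
    conn.close()
)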


Re: get -101 error code when running select query

2014-04-23 Thread qingyang li
Thanks for sharing; my case is different from yours.
I set hive.server2.enable.doAs to false in hive-site.xml, and then the
-101 error code disappeared.
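
(For reference, the corresponding hive-site.xml entry, as a sketch; restart
SharkServer2 after changing it:

    <property>
      <name>hive.server2.enable.doAs</name>
      <value>false</value>
    </property>
)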



2014-04-24 9:26 GMT+08:00 Madhu ma...@madhu.com:

 I have seen a similar error message when connecting to Hive through JDBC.
 This is just a guess on my part, but check your query. The error occurs if
 you have a select that includes a null literal with an alias like this:

 select a, b, null as c, d from foo

 In my case, rewriting the query to use an empty string or other literal
 instead of null worked:

 select a, b, '' as c, d from foo

 I think the problem is the lack of type information when supplying a null
 literal.






Re: does shark0.9.1 work well with hadoop2.2.0 ?

2014-04-22 Thread qingyang li
This can help resolve the protobuf version problem, too:
https://groups.google.com/forum/#!msg/shark-users/0pGIVQvaYfo/-43oaK8scNAJ
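
(For reference, a sketch of the kind of build-script change Gordon describes
below; the exact file and setting names in Shark's build are assumptions to
verify against project/SharkBuild.scala:

    // sbt (Scala) sketch: force a single protobuf version across the build.
    libraryDependencies += "com.google.protobuf" % "protobuf-java" % "2.5.0"
    dependencyOverrides += "com.google.protobuf" % "protobuf-java" % "2.5.0"

Then recompile, e.g. with: SHARK_HADOOP_VERSION=2.2.0 sbt/sbt package)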


2014-04-20 23:53 GMT+08:00 Gordon Wang gw...@gopivotal.com:

 Replacing the jar is not enough.
 You have to change the protobuf dependency in Shark's build script and
 recompile the source.

 Protobuf 2.4.1 and 2.5.0 are not binary compatible.


 On Sun, Apr 20, 2014 at 6:45 PM, qingyang li liqingyang1...@gmail.com
 wrote:

  Shark 0.9.1 is using protobuf 2.4.1, but Hadoop 2.2.0 is using protobuf
  2.5.0; how can we make them work together?
  I have tried replacing protobuf 2.4.1 in Shark with protobuf 2.5.0; it
  does not work.
  I have also tried replacing protobuf 2.5.0 in Hadoop with Shark's 2.4.1;
  that does not work either.
 



 --
 Regards
 Gordon Wang



does shark0.9.1 work well with hadoop2.2.0 ?

2014-04-20 Thread qingyang li
Shark 0.9.1 is using protobuf 2.4.1, but Hadoop 2.2.0 is using protobuf
2.5.0; how can we make them work together?
I have tried replacing protobuf 2.4.1 in Shark with protobuf 2.5.0; it does
not work.
I have also tried replacing protobuf 2.5.0 in Hadoop with Shark's 2.4.1;
that does not work either.


Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-03-26 Thread qingyang li
Egor, I encountered the same problem you asked about in this thread:

http://mail-archives.apache.org/mod_mbox/spark-user/201402.mbox/%3CCAMrx5DwJVJS0g_FE7_2qwMu4Xf0y5VfV=tlyauv2kh5v4k6...@mail.gmail.com%3E

Have you fixed this problem?

I am using Shark to read a table I created on HDFS.

In Shark's lib_managed directory I found two protobuf*.jar files:
[root@bigdata001 shark-0.9.0]# find . -name proto*.jar
./lib_managed/jars/org.spark-project.protobuf/protobuf-java/protobuf-java-2.4.1-shaded.jar
./lib_managed/bundles/com.google.protobuf/protobuf-java/protobuf-java-2.5.0.jar


My Hadoop is using protobuf-java-2.5.0.jar.


Re: Error executing sql using shark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-03-26 Thread qingyang li
My Spark works well with Hadoop 2.2.0, but my Shark does not, because of the
protobuf version problem.

In the Shark directory I found two versions of protobuf, and both are
loaded into the classpath:

[root@bigdata001 shark-0.9.0]# find . -name proto*.jar

./lib_managed/jars/org.spark-project.protobuf/protobuf-java/protobuf-java-2.4.1-shaded.jar
./lib_managed/bundles/com.google.protobuf/protobuf-java/protobuf-java-2.5.0.jar




2014-03-27 2:26 GMT+08:00 yao yaosheng...@gmail.com:

 @qingyang, Spark 0.9.0 works for me perfectly when accessing (read/write)
 data on HDFS. BTW, if you look at pom.xml, you have to choose the yarn
 profile to compile Spark, so that it won't include protobuf 2.4.1 in your
 final jars. Here is the command line we use to compile Spark with Hadoop
 2.2:

 mvn -U -Dyarn.version=2.2.0 -Dhadoop.version=2.2.0 -Pyarn -DskipTests
 package

 Thanks
 -Shengzhe


 On Wed, Mar 26, 2014 at 12:04 AM, qingyang li liqingyang1...@gmail.com
 wrote:

  Egor, I encountered the same problem you asked about in this thread:

  http://mail-archives.apache.org/mod_mbox/spark-user/201402.mbox/%3CCAMrx5DwJVJS0g_FE7_2qwMu4Xf0y5VfV=tlyauv2kh5v4k6...@mail.gmail.com%3E

  Have you fixed this problem?

  I am using Shark to read a table I created on HDFS.

  In Shark's lib_managed directory I found two protobuf*.jar files:
  [root@bigdata001 shark-0.9.0]# find . -name proto*.jar
  ./lib_managed/jars/org.spark-project.protobuf/protobuf-java/protobuf-java-2.4.1-shaded.jar
  ./lib_managed/bundles/com.google.protobuf/protobuf-java/protobuf-java-2.5.0.jar

  My Hadoop is using protobuf-java-2.5.0.jar.
 



Re: how to config worker HA

2014-03-20 Thread qingyang li
Can someone help me?


2014-03-12 21:26 GMT+08:00 qingyang li liqingyang1...@gmail.com:

 In addition:
 On this site:
 https://spark.apache.org/docs/0.9.0/scala-programming-guide.html#hadoop-datasets
 I see that an RDD can be stored using a different storage level, and I also
 found StorageLevel's attribute MEMORY_ONLY_2:
 MEMORY_ONLY_2 - same as the levels above, but replicate each partition on
 two cluster nodes.
 1. Is this one aspect of fault tolerance?
 2. Will replicating each partition on two cluster nodes help with worker
 node HA?
 3. Is there a MEMORY_ONLY_3 which would replicate each partition on three
 cluster nodes?




 2014-03-12 12:11 GMT+08:00 qingyang li liqingyang1...@gmail.com:

 I have one table in memory; when one worker dies, I cannot query data from
 that table. Here is its storage status
 (http://192.168.1.101:4040/storage/rdd?id=47):

 RDD Name: table01
 Storage Level: Memory Deserialized 1x Replicated
 Cached Partitions: 119
 Fraction Cached: 88%
 Size in Memory: 697.0 MB
 Size on Disk: 0.0 B

 So, my questions are:
 1. What does Memory Deserialized 1x Replicated mean?
 2. How do I configure worker HA so that I can query data even with one
 worker dead?
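
 (For reference on the replication questions above, a minimal sketch; rdd
 is a placeholder, and the custom-replication constructor is a Spark
 0.9-era signature to verify against your version:

     import org.apache.spark.storage.StorageLevel

     // two in-memory, deserialized copies of each partition (MEMORY_ONLY_2)
     rdd.persist(StorageLevel.MEMORY_ONLY_2)

     // there is no predefined MEMORY_ONLY_3, but a custom level can be
     // built: StorageLevel(useDisk, useMemory, deserialized, replication)
     val memoryOnly3 = StorageLevel(false, true, true, 3)
     rdd.persist(memoryOnly3)
 )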





java.lang.IllegalAccessError: tried to access field org.apache.hadoop.hive.ql.security.HadoopDefaultAuthenticator.conf from class org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator

2014-03-17 Thread qingyang li
Dear community, I used the following commands to build Shark 0.9:

export  SHARK_HADOOP_VERSION=2.2.0
sbt/sbt package

But when I run bin/shark, I got this error:

Exception in thread "main" java.lang.IllegalAccessError: tried to access
field org.apache.hadoop.hive.ql.security.HadoopDefaultAuthenticator.conf
from class org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator
at
org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator.setConf(ProxyUserAuthenticator.java:40)
at
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
at
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at
org.apache.hadoop.hive.ql.metadata.HiveUtils.getAuthenticator(HiveUtils.java:365)
at
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:270)
at shark.SharkCliDriver$.main(SharkCliDriver.scala:128)
at shark.SharkCliDriver.main(SharkCliDriver.scala)

It seems conf is a private field;
does anyone know how to resolve this problem?


Re: if there is shark 0.9 build can be download?

2014-03-12 Thread qingyang li
Building Shark from the latest source code is too slow; is there a Shark
0.9 build that can be downloaded?


2014-03-11 9:29 GMT+08:00 qingyang li liqingyang1...@gmail.com:

 Does anyone know if there is a Shark 0.9 build that can be downloaded?
 If not, when will there be a Shark 0.9 build?