[ 
https://issues.apache.org/jira/browse/PIG-5080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16091220#comment-16091220
 ] 

liyunzhang_intel commented on PIG-5080:
---------------------------------------

[~jeffzhang]: thanks for your patch. What is the benefit of storing a Pig 
alias as a Spark temporary table? Is it to use the result of a Pig script in 
another Spark engine such as Spark SQL?
If we use DataFrame in place of RDD, what is the benefit? Can you show the 
detailed performance improvement on some benchmark tests?

> Support store alias as spark table
> ----------------------------------
>
>                 Key: PIG-5080
>                 URL: https://issues.apache.org/jira/browse/PIG-5080
>             Project: Pig
>          Issue Type: New Feature
>          Components: spark
>    Affects Versions: 0.16.0
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>             Fix For: 0.17.1
>
>         Attachments: PIG-5080-1.patch, PIG-5080-2.patch
>
>
> The purpose is to take advantage of both Pig and Hive. 
> Pig Latin has powerful data-flow expressiveness, which is useful for ETL, 
> while Hive is good at queries. 
> The scenario is that I'd like to store a Pig alias as a Spark temporary table 
> (caching can be optional), and have another Spark engine that shares the 
> same SparkContext (in the same JVM) query the table.
> Please close this ticket if this is already supported. I didn't go through all 
> the features of pig-spark.



