[jira] [Commented] (BEAM-8251) Add worker_region and worker_zone options

2020-10-13 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17213478#comment-17213478
 ] 

Jing Chen commented on BEAM-8251:
-

It seems all subtasks are done, so we should be able to close the issue.

> Add worker_region and worker_zone options
> -
>
> Key: BEAM-8251
> URL: https://issues.apache.org/jira/browse/BEAM-8251
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Kyle Weaver
>Priority: P3
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> We are refining the way the user specifies worker regions and zones to the 
> Dataflow service. We need to add worker_region and worker_zone pipeline 
> options that will be preferred over the old experiments=worker_region and 
> --zone flags. I will create subtasks for adding these options to each SDK.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-8253) (Go SDK) Add worker_region and worker_zone options

2020-10-13 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen updated BEAM-8253:

Status: Resolved  (was: Open)

> (Go SDK) Add worker_region and worker_zone options
> --
>
> Key: BEAM-8253
> URL: https://issues.apache.org/jira/browse/BEAM-8253
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow, sdk-go
>Reporter: Kyle Weaver
>Assignee: Jing Chen
>Priority: P3
>  Labels: stale-assigned
>






[jira] [Updated] (BEAM-8253) (Go SDK) Add worker_region and worker_zone options

2020-10-13 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen updated BEAM-8253:

Status: Resolved  (was: Resolved)

> (Go SDK) Add worker_region and worker_zone options
> --
>
> Key: BEAM-8253
> URL: https://issues.apache.org/jira/browse/BEAM-8253
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow, sdk-go
>Reporter: Kyle Weaver
>Assignee: Jing Chen
>Priority: P3
>  Labels: stale-assigned
>






[jira] [Assigned] (BEAM-8253) (Go SDK) Add worker_region and worker_zone options

2020-10-10 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen reassigned BEAM-8253:
---

Assignee: Jing Chen

> (Go SDK) Add worker_region and worker_zone options
> --
>
> Key: BEAM-8253
> URL: https://issues.apache.org/jira/browse/BEAM-8253
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow, sdk-go
>Reporter: Kyle Weaver
>Assignee: Jing Chen
>Priority: P3
>  Labels: stale-assigned
>






[jira] [Assigned] (BEAM-9959) Mistakes Computing Composite Inputs and Outputs

2020-10-09 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen reassigned BEAM-9959:
---

Assignee: Jing Chen

> Mistakes Computing Composite Inputs and Outputs
> ---
>
> Key: BEAM-9959
> URL: https://issues.apache.org/jira/browse/BEAM-9959
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Robert Burke
>Assignee: Jing Chen
>Priority: P3
>
> The Go SDK uses a Scope object to manage beam Composites.
> A bug was discovered when consuming a PCollection in both the composite that 
> created it, and in a separate composite.
> Further, the Go SDK should verify that the root hypergraph structure is a DAG 
> and provide a reasonable error if it is not.  In particular, the leaf nodes of 
> the graph could form a DAG, but due to how the beam.Scope object is used, the 
> hypergraph might still not be a DAG.
> E.g., it's possible to write the following in the Go SDK.
>  PTransforms A, B, C and PCollections colA, colB, and Composites a, b.
> A and C are in a, and B is in b.
> A generates colA
> B consumes colA, and generates colB.
> C consumes colA and colB.
> ```
> a := s.Scope(a)
> b := s.Scope(b)
> colA := beam.Impulse(a)
> colB := beam.ParDo(b, , colA)
> beam.ParDo0(a, , colA, beam.SideInput{colB})
> ```
> If it doesn't already, the Go SDK must emit a clear error, and fail pipeline 
> construction.
> If the affected composites are roots in the graph, the cycle prevents being 
> able to topologically sort the root ptransforms for the pipeline graph, which 
> can adversely affect runners.
> The recommendation is always to wrap uses of scope in functions or other 
> scopes to prevent such incorrect constructions.
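
The DAG check described above can be sketched outside the SDK. The following is a minimal illustration only, not the Go SDK's implementation: the `Transform` type and `CheckComposites` function are hypothetical names, and composites are modelled as plain strings. It derives edges between composites from the PCollections that cross composite boundaries, then tries to order the composites with Kahn's algorithm; if some composites cannot be ordered, they form a cycle.

```go
package main

import "fmt"

// Transform is a hypothetical, simplified stand-in for a leaf PTransform.
type Transform struct {
	Name      string
	Composite string   // name of the enclosing composite
	Inputs    []string // PCollections consumed
	Outputs   []string // PCollections produced
}

// CheckComposites returns an error if the edges between composites form a cycle.
func CheckComposites(ts []Transform) error {
	// Map each PCollection to the composite that produced it.
	producer := map[string]string{}
	indegree := map[string]int{}
	for _, t := range ts {
		indegree[t.Composite] += 0 // register the composite
		for _, out := range t.Outputs {
			producer[out] = t.Composite
		}
	}
	// edges[x][y] means composite y consumes a PCollection produced in x.
	edges := map[string]map[string]bool{}
	for _, t := range ts {
		for _, in := range t.Inputs {
			from, ok := producer[in]
			if !ok || from == t.Composite {
				continue // unknown source, or produced and consumed in the same composite
			}
			if edges[from] == nil {
				edges[from] = map[string]bool{}
			}
			if !edges[from][t.Composite] {
				edges[from][t.Composite] = true
				indegree[t.Composite]++
			}
		}
	}
	// Kahn's algorithm: repeatedly remove composites with no incoming edges.
	var ready []string
	for c, d := range indegree {
		if d == 0 {
			ready = append(ready, c)
		}
	}
	ordered := 0
	for len(ready) > 0 {
		c := ready[0]
		ready = ready[1:]
		ordered++
		for next := range edges[c] {
			indegree[next]--
			if indegree[next] == 0 {
				ready = append(ready, next)
			}
		}
	}
	if ordered != len(indegree) {
		return fmt.Errorf("composites form a cycle: %d of %d could not be ordered",
			len(indegree)-ordered, len(indegree))
	}
	return nil
}

func main() {
	// The construction from the description: A and C in a, B in b.
	ts := []Transform{
		{Name: "A", Composite: "a", Outputs: []string{"colA"}},
		{Name: "B", Composite: "b", Inputs: []string{"colA"}, Outputs: []string{"colB"}},
		{Name: "C", Composite: "a", Inputs: []string{"colA", "colB"}},
	}
	fmt.Println(CheckComposites(ts))
}
```

In this model the leaf transforms A, B, C are acyclic, yet composite a feeds b (via colA) and b feeds a (via colB), which is the situation the description warns about.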





[jira] [Commented] (BEAM-8253) (Go SDK) Add worker_region and worker_zone options

2020-10-09 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210699#comment-17210699
 ] 

Jing Chen commented on BEAM-8253:
-

Do you mind sharing more information about the task? I may be able to pick it up.

> (Go SDK) Add worker_region and worker_zone options
> --
>
> Key: BEAM-8253
> URL: https://issues.apache.org/jira/browse/BEAM-8253
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow, sdk-go
>Reporter: Kyle Weaver
>Priority: P3
>  Labels: stale-assigned
>






[jira] [Updated] (BEAM-8017) Plumb errors and remove panics from package graphx

2020-10-08 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen updated BEAM-8017:

Status: Resolved  (was: Open)

> Plumb errors and remove panics from package graphx
> --
>
> Key: BEAM-8017
> URL: https://issues.apache.org/jira/browse/BEAM-8017
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Robert Burke
>Assignee: Jing Chen
>Priority: P3
>  Labels: Novice, beginner, noob, starter
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The graphx package, and in particular serialize.go and coder.go should be 
> returning errors back up, rather than panicking when issues occur deeper when 
> marshalling types. It makes errors harder to follow since there's now a less 
> necessary panic trace to skip, rather than a clearly constructed error 
> message.
> Not difficult, but may be tedious. Requires plumbing the errors and 
> handling/wrapping them appropriately instead of using panic. Most error 
> handling is presently correctly wrapped anyway.
> The graphx package as a rule is intended for beam internal use, and not part 
> of the user surface, so making the API changes (which aren't backwards 
> compatible) isn't the worst. Most of the affected methods are unexported.





[jira] [Assigned] (BEAM-8017) Plumb errors and remove panics from package graphx

2020-09-23 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen reassigned BEAM-8017:
---

Assignee: Jing Chen

> Plumb errors and remove panics from package graphx
> --
>
> Key: BEAM-8017
> URL: https://issues.apache.org/jira/browse/BEAM-8017
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Robert Burke
>Assignee: Jing Chen
>Priority: P3
>  Labels: Novice, beginner, noob, starter
>
> The graphx package, and in particular serialize.go and coder.go should be 
> returning errors back up, rather than panicking when issues occur deeper when 
> marshalling types. It makes errors harder to follow since there's now a less 
> necessary panic trace to skip, rather than a clearly constructed error 
> message.
> Not difficult, but may be tedious. Requires plumbing the errors and 
> handling/wrapping them appropriately instead of using panic. Most error 
> handling is presently correctly wrapped anyway.
> The graphx package as a rule is intended for beam internal use, and not part 
> of the user surface, so making the API changes (which aren't backwards 
> compatible) isn't the worst. Most of the affected methods are unexported.





[jira] [Commented] (BEAM-10660) [Go SDK] Implement Timer support

2020-09-17 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-10660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17197443#comment-17197443
 ] 

Jing Chen commented on BEAM-10660:
--

Hi, do you have more details on the issue?

> [Go SDK] Implement Timer support
> 
>
> Key: BEAM-10660
> URL: https://issues.apache.org/jira/browse/BEAM-10660
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-go
>Reporter: Robert Burke
>Priority: P2
>






[jira] [Commented] (BEAM-10159) Support Reading data from Databricks Delta

2020-06-30 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-10159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17149026#comment-17149026
 ] 

Jing Chen commented on BEAM-10159:
--

Is there any analog within Beam that supports such reading?

> Support Reading data from Databricks Delta
> --
>
> Key: BEAM-10159
> URL: https://issues.apache.org/jira/browse/BEAM-10159
> Project: Beam
>  Issue Type: New Feature
>  Components: io-ideas
>Reporter: Ismaël Mejía
>Priority: P2
>
> Databricks Delta is an open source storage layer on top of different 
> filesystems. The current implementation of Delta is strongly coupled with 
> Spark so we cannot rely on it because it would break Beam portability.
> However now there is an open specification for Delta's protocol.
> https://github.com/delta-io/delta/blob/master/PROTOCOL.md
> Another possible approach could be to investigate whether Beam could use a 
> manifest-based approach like Presto does:
> https://docs.databricks.com/delta/presto-integration.html





[jira] [Assigned] (BEAM-5504) PubsubAvroTable

2020-06-30 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen reassigned BEAM-5504:
---

Assignee: Jing Chen

> PubsubAvroTable
> ---
>
> Key: BEAM-5504
> URL: https://issues.apache.org/jira/browse/BEAM-5504
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Jing Chen
>Priority: P2
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-5504) PubsubAvroTable

2020-01-13 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014758#comment-17014758
 ] 

Jing Chen commented on BEAM-5504:
-

[~amaliujia] curious if there is an easy way to run integration locally.

> PubsubAvroTable
> ---
>
> Key: BEAM-5504
> URL: https://issues.apache.org/jira/browse/BEAM-5504
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Jing Chen
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Comment Edited] (BEAM-5504) PubsubAvroTable

2020-01-13 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014758#comment-17014758
 ] 

Jing Chen edited comment on BEAM-5504 at 1/14/20 12:59 AM:
---

[~amaliujia] curious if there is an easy way to run integration tests locally.


was (Author: jingc):
[~amaliujia] curious if there is an easy way to run integration locally.

> PubsubAvroTable
> ---
>
> Key: BEAM-5504
> URL: https://issues.apache.org/jira/browse/BEAM-5504
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Jing Chen
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Work started] (BEAM-5504) PubsubAvroTable

2020-01-01 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-5504 started by Jing Chen.
---
> PubsubAvroTable
> ---
>
> Key: BEAM-5504
> URL: https://issues.apache.org/jira/browse/BEAM-5504
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Jing Chen
>Priority: Major
>






[jira] [Commented] (BEAM-8801) PubsubMessageToRow should not check useFlatSchema() in processElement

2019-12-06 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16990289#comment-16990289
 ] 

Jing Chen commented on BEAM-8801:
-

[~bhulette] thank you for the explanation.

I am able to pick up the ticket. 

I am working on [BEAM-5504|https://issues.apache.org/jira/browse/BEAM-5504] to 
add Avro support for the Pubsub table, and it might benefit from the refactoring.

 

Feel free to let me know if you have any comments :)

> PubsubMessageToRow should not check useFlatSchema() in processElement
> -
>
> Key: BEAM-8801
> URL: https://issues.apache.org/jira/browse/BEAM-8801
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>
> Currently we check useFlatSchema() for every element that's processed. 
> Instead, we should check it once at pipeline construction time. See 
> [comment|https://github.com/apache/beam/pull/10158#discussion_r348805530].
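
The idea of moving the check from per-element processing to pipeline construction time can be sketched as follows. This is an illustration in Go rather than the Java code in question, and everything in it is an assumption: `Message`, `newConverter`, and the flat and nested row layouts are hypothetical and do not reflect the actual PubsubMessageToRow schema handling.

```go
package main

import "fmt"

// Message is a hypothetical, simplified message type.
type Message struct {
	Payload    map[string]string
	Attributes map[string]string
}

// convertFn turns a message into a row (modelled here as a plain map).
type convertFn func(Message) map[string]string

// newConverter inspects useFlatSchema once, when the pipeline is built,
// and returns a function that no longer branches on it per element.
func newConverter(useFlatSchema bool) convertFn {
	if useFlatSchema {
		return func(m Message) map[string]string {
			return m.Payload
		}
	}
	return func(m Message) map[string]string {
		row := map[string]string{}
		for k, v := range m.Payload {
			row["payload."+k] = v
		}
		for k, v := range m.Attributes {
			row["attributes."+k] = v
		}
		return row
	}
}

func main() {
	msg := Message{
		Payload:    map[string]string{"id": "1"},
		Attributes: map[string]string{"source": "test"},
	}
	flat := newConverter(true)    // decided at construction time
	nested := newConverter(false) // decided at construction time
	fmt.Println(flat(msg))
	fmt.Println(nested(msg))
}
```

The per-element path is then a single function call, and the schema decision is made exactly once.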





[jira] [Assigned] (BEAM-8801) PubsubMessageToRow should not check useFlatSchema() in processElement

2019-12-06 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen reassigned BEAM-8801:
---

Assignee: Jing Chen  (was: Brian Hulette)

> PubsubMessageToRow should not check useFlatSchema() in processElement
> -
>
> Key: BEAM-8801
> URL: https://issues.apache.org/jira/browse/BEAM-8801
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Brian Hulette
>Assignee: Jing Chen
>Priority: Major
>
> Currently we check useFlatSchema() for every element that's processed. 
> Instead, we should check it once at pipeline construction time. See 
> [comment|https://github.com/apache/beam/pull/10158#discussion_r348805530].





[jira] [Commented] (BEAM-8801) PubsubMessageToRow should not check useFlatSchema() in processElement

2019-12-03 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987401#comment-16987401
 ] 

Jing Chen commented on BEAM-8801:
-

I happened to see the TODO in the code base. I am curious what you mean by 
`pipeline construction time`.

> PubsubMessageToRow should not check useFlatSchema() in processElement
> -
>
> Key: BEAM-8801
> URL: https://issues.apache.org/jira/browse/BEAM-8801
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>
> Currently we check useFlatSchema() for every element that's processed. 
> Instead, we should check it once at pipeline construction time. See 
> [comment|https://github.com/apache/beam/pull/10158#discussion_r348805530].





[jira] [Commented] (BEAM-5504) PubsubAvroTable

2019-12-03 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987351#comment-16987351
 ] 

Jing Chen commented on BEAM-5504:
-

thanks for the heads up.

 

Will put both in the same PR.

> PubsubAvroTable
> ---
>
> Key: BEAM-5504
> URL: https://issues.apache.org/jira/browse/BEAM-5504
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Jing Chen
>Priority: Major
>






[jira] [Assigned] (BEAM-5504) PubsubAvroTable

2019-12-03 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen reassigned BEAM-5504:
---

Assignee: Jing Chen

> PubsubAvroTable
> ---
>
> Key: BEAM-5504
> URL: https://issues.apache.org/jira/browse/BEAM-5504
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Jing Chen
>Priority: Major
>






[jira] [Commented] (BEAM-5504) PubsubAvroTable

2019-12-03 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987277#comment-16987277
 ] 

Jing Chen commented on BEAM-5504:
-

[~amaliujia] I just need to clarify that we need to support the Avro format in 
both Pubsub read and write. Am I correct?

> PubsubAvroTable
> ---
>
> Key: BEAM-5504
> URL: https://issues.apache.org/jira/browse/BEAM-5504
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Rui Wang
>Priority: Major
>






[jira] [Assigned] (BEAM-7996) Add support for remaining data types in python RowCoder

2019-11-30 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen reassigned BEAM-7996:
---

Assignee: Jing Chen

> Add support for remaining data types in python RowCoder 
> 
>
> Key: BEAM-7996
> URL: https://issues.apache.org/jira/browse/BEAM-7996
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Assignee: Jing Chen
>Priority: Major
>
> In the initial [python RowCoder 
> implementation|https://github.com/apache/beam/pull/9188] we only added 
> support for the data types that already had coders in the Python SDK. We 
> should add support for the remaining data types that are not currently 
> supported:
> * INT8 (ByteCoder in Java)
> * INT16 (BigEndianShortCoder in Java)
> * FLOAT (FloatCoder in Java)
> * BOOLEAN (BooleanCoder in Java)
> * Map (MapCoder in Java)
> We might consider making those coders standard so they can be tested 
> independently from RowCoder in standard_coders.yaml. Or, if we don't do that 
> we should probably add a more robust testing framework for RowCoder itself, 
> because it will be challenging to test all of these types as part of the 
> RowCoder tests in standard_coders.yaml.





[jira] [Resolved] (BEAM-8406) TextTable support JSON format

2019-11-28 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen resolved BEAM-8406.
-
Fix Version/s: Not applicable
   Resolution: Done

> TextTable support JSON format
> -
>
> Key: BEAM-8406
> URL: https://issues.apache.org/jira/browse/BEAM-8406
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Jing Chen
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Have a JSON table implementation similar to [1].
> [1]: 
> https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/text/TextTable.java





[jira] [Commented] (BEAM-5504) Pubsub Write by BeamSQL

2019-11-27 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16983874#comment-16983874
 ] 

Jing Chen commented on BEAM-5504:
-

[~amaliujia] do you mind sharing more context and information?

> Pubsub Write by BeamSQL
> ---
>
> Key: BEAM-5504
> URL: https://issues.apache.org/jira/browse/BEAM-5504
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Rui Wang
>Priority: Major
>






[jira] [Work started] (BEAM-8298) Implement state caching for side inputs

2019-11-27 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-8298 started by Jing Chen.
---
> Implement state caching for side inputs
> ---
>
> Key: BEAM-8298
> URL: https://issues.apache.org/jira/browse/BEAM-8298
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core, sdk-py-harness
>Reporter: Maximilian Michels
>Assignee: Jing Chen
>Priority: Major
>
> Caching is currently only implemented for user state.





[jira] [Commented] (BEAM-3788) Implement a Kafka IO for Python SDK

2019-11-26 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16982762#comment-16982762
 ] 

Jing Chen commented on BEAM-3788:
-

I am curious whether the plan is to integrate Confluent's Schema Registry, etc., 
or simply to support vanilla Kafka.

> Implement a Kafka IO for Python SDK
> ---
>
> Key: BEAM-3788
> URL: https://issues.apache.org/jira/browse/BEAM-3788
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Priority: Major
>
> This will be implemented using the Splittable DoFn framework.





[jira] [Work started] (BEAM-8406) TextTable support JSON format

2019-11-26 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-8406 started by Jing Chen.
---
> TextTable support JSON format
> -
>
> Key: BEAM-8406
> URL: https://issues.apache.org/jira/browse/BEAM-8406
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Jing Chen
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Have a JSON table implementation similar to [1].
> [1]: 
> https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/text/TextTable.java





[jira] [Assigned] (BEAM-8406) TextTable support JSON format

2019-11-22 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen reassigned BEAM-8406:
---

Assignee: Jing Chen

> TextTable support JSON format
> -
>
> Key: BEAM-8406
> URL: https://issues.apache.org/jira/browse/BEAM-8406
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Jing Chen
>Priority: Major
>
> Have a JSON table implementation similar to [1].
> [1]: 
> https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/text/TextTable.java





[jira] [Assigned] (BEAM-8298) Implement state caching for side inputs

2019-11-08 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen reassigned BEAM-8298:
---

Assignee: Jing Chen

> Implement state caching for side inputs
> ---
>
> Key: BEAM-8298
> URL: https://issues.apache.org/jira/browse/BEAM-8298
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core, sdk-py-harness
>Reporter: Maximilian Michels
>Assignee: Jing Chen
>Priority: Major
>
> Caching is currently only implemented for user state.





[jira] [Commented] (BEAM-8338) Support ES 7.x for ElasticsearchIO

2019-11-08 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969957#comment-16969957
 ] 

Jing Chen commented on BEAM-8338:
-

[~mbrunat] thanks for the heads-up. Feel free to self-assign the issue :)

> Support ES 7.x for ElasticsearchIO
> --
>
> Key: BEAM-8338
> URL: https://issues.apache.org/jira/browse/BEAM-8338
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-elasticsearch
>Reporter: Michal Brunát
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Elasticsearch has released 7.4 but ElasticsearchIO only supports 2.x, 5.x, 6.x.
>  We should support ES 7.x for ElasticsearchIO.
>  [https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html]
>  
> [https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java]





[jira] [Assigned] (BEAM-8338) Support ES 7.x for ElasticsearchIO

2019-11-08 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen reassigned BEAM-8338:
---

Assignee: (was: Jing Chen)

> Support ES 7.x for ElasticsearchIO
> --
>
> Key: BEAM-8338
> URL: https://issues.apache.org/jira/browse/BEAM-8338
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-elasticsearch
>Reporter: Michal Brunát
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Elasticsearch has released 7.4 but ElasticsearchIO only supports 2.x, 5.x, 6.x.
>  We should support ES 7.x for ElasticsearchIO.
>  [https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html]
>  
> [https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java]





[jira] [Commented] (BEAM-8298) Implement state caching for side inputs

2019-11-07 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969667#comment-16969667
 ] 

Jing Chen commented on BEAM-8298:
-

[~mxm] would you mind sharing details on the issue?

I am interested in working on it if it is still free.

> Implement state caching for side inputs
> ---
>
> Key: BEAM-8298
> URL: https://issues.apache.org/jira/browse/BEAM-8298
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Maximilian Michels
>Priority: Major
>
> Caching is currently only implemented for user state.





[jira] [Assigned] (BEAM-8338) Support ES 7.x for ElasticsearchIO

2019-11-06 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen reassigned BEAM-8338:
---

Assignee: Jing Chen

> Support ES 7.x for ElasticsearchIO
> --
>
> Key: BEAM-8338
> URL: https://issues.apache.org/jira/browse/BEAM-8338
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-elasticsearch
>Reporter: Michal Brunát
>Assignee: Jing Chen
>Priority: Major
>
> Elasticsearch has released 7.4 but ElasticsearchIO only supports 2.x, 5.x, 6.x.
>  We should support ES 7.x for ElasticsearchIO.
>  [https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html]
>  
> [https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java]





[jira] [Commented] (BEAM-8376) Add FirestoreIO connector to Java SDK

2019-11-06 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968627#comment-16968627
 ] 

Jing Chen commented on BEAM-8376:
-

Chatted with [~chamikara] offline; there are ongoing efforts on the issue. 
Unassigning myself.

> Add FirestoreIO connector to Java SDK
> -
>
> Key: BEAM-8376
> URL: https://issues.apache.org/jira/browse/BEAM-8376
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Stefan Djelekar
>Priority: Major
>
> Motivation:
> There is no Firestore connector for Java SDK at the moment.
> Having it will enhance the integrations with database options on the Google 
> Cloud Platform.





[jira] [Assigned] (BEAM-8376) Add FirestoreIO connector to Java SDK

2019-11-06 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen reassigned BEAM-8376:
---

Assignee: (was: Jing Chen)

> Add FirestoreIO connector to Java SDK
> -
>
> Key: BEAM-8376
> URL: https://issues.apache.org/jira/browse/BEAM-8376
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Stefan Djelekar
>Priority: Major
>
> Motivation:
> There is no Firestore connector for Java SDK at the moment.
> Having it will enhance the integrations with database options on the Google 
> Cloud Platform.





[jira] [Assigned] (BEAM-8561) Add ThriftIO to Support IO for Thrift Files

2019-11-06 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen reassigned BEAM-8561:
---

Assignee: (was: Jing Chen)

> Add ThriftIO to Support IO for Thrift Files
> ---
>
> Key: BEAM-8561
> URL: https://issues.apache.org/jira/browse/BEAM-8561
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-files
>Reporter: Chris Larsen
>Priority: Minor
>
> Similar to AvroIO it would be very useful to support reading and writing 
> to/from Thrift files with a native connector. 
> Functionality would include:
>  # read() - Reading from one or more Thrift files.
>  # write() - Writing to one or more Thrift files.





[jira] [Commented] (BEAM-8561) Add ThriftIO to Support IO for Thrift Files

2019-11-06 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968555#comment-16968555
 ] 

Jing Chen commented on BEAM-8561:
-

[~clarsen] that would be great. Reassigning the ticket to you; let me know if 
there is anything I can help with :)

> Add ThriftIO to Support IO for Thrift Files
> ---
>
> Key: BEAM-8561
> URL: https://issues.apache.org/jira/browse/BEAM-8561
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-files
>Reporter: Chris Larsen
>Assignee: Jing Chen
>Priority: Minor
>
> Similar to AvroIO it would be very useful to support reading and writing 
> to/from Thrift files with a native connector. 
> Functionality would include:
>  # read() - Reading from one or more Thrift files.
>  # write() - Writing to one or more Thrift files.





[jira] [Assigned] (BEAM-8561) Add ThriftIO to Support IO for Thrift Files

2019-11-06 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen reassigned BEAM-8561:
---

Assignee: Jing Chen

> Add ThriftIO to Support IO for Thrift Files
> ---
>
> Key: BEAM-8561
> URL: https://issues.apache.org/jira/browse/BEAM-8561
> Project: Beam
> Issue Type: New Feature
> Components: io-java-files
> Reporter: Chris Larsen
> Assignee: Jing Chen
> Priority: Minor
>
> Similar to AvroIO it would be very useful to support reading and writing 
> to/from Thrift files with a native connector. 
> Functionality would include:
>  # read() - Reading from one or more Thrift files.
>  # write() - Writing to one or more Thrift files.





[jira] [Commented] (BEAM-7732) Allow access to SpannerOptions in Beam

2019-11-06 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968186#comment-16968186
 ] 

Jing Chen commented on BEAM-7732:
-

I am curious whether this issue is still open, [~nielm].

> Allow access to SpannerOptions in Beam
> --
>
> Key: BEAM-7732
> URL: https://issues.apache.org/jira/browse/BEAM-7732
> Project: Beam
> Issue Type: Improvement
> Components: io-java-gcp
> Affects Versions: 2.12.0, 2.13.0
> Reporter: Niel Markwick
> Priority: Minor
> Time Spent: 3.5h
> Remaining Estimate: 0h
>
> Beam hides the 
> [SpannerOptions|https://github.com/googleapis/google-cloud-java/blob/master/google-cloud-clients/google-cloud-spanner/src/main/java/com/google/cloud/spanner/SpannerOptions.java]
>  object behind a 
> [SpannerConfig|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerConfig.java]
>  object because the SpannerOptions object is not serializable. 
> This means that the only options that can be set are those that can be 
> specified in SpannerConfig - limited to host, project, instance, database.
> Suggestion: add the possibility to set a SpannerOptionsFactory in 
> SpannerConfig:
> {code:java}
> public interface SpannerOptionsFactory extends Serializable {
>    public SpannerOptions create();
> }
> {code}
> This would allow the user to use this factory class to specify custom 
> SpannerOptions before they are passed on to the connectToSpanner() method; 
> connectToSpanner() would then become: 
> {code:java}
> public SpannerAccessor connectToSpanner() {
>   
>   SpannerOptions.Builder builder = spannerOptionsFactory.create().toBuilder();
>   // rest of connectToSpanner follows, setting project, host, etc.
> {code}
>  
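
The proposal quoted above relies on a general pattern: keep the non-serializable options object out of the serialized config and ship a serializable factory that builds it on the worker. A minimal, self-contained sketch of that pattern follows. ClientOptions, FixedOptionsFactory and Config are hypothetical stand-ins (for SpannerOptions, a user-supplied SpannerOptionsFactory, and SpannerConfig respectively); none of them is Beam or Spanner API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class SerializableFactorySketch {

  // Hypothetical stand-in for a non-serializable options class.
  static final class ClientOptions {
    final String host;
    final int numChannels;

    ClientOptions(String host, int numChannels) {
      this.host = host;
      this.numChannels = numChannels;
    }
  }

  // Same shape as the proposed SpannerOptionsFactory: serializable, builds options on demand.
  interface ClientOptionsFactory extends Serializable {
    ClientOptions create();
  }

  // Example factory; it carries only serializable fields.
  static final class FixedOptionsFactory implements ClientOptionsFactory {
    private static final long serialVersionUID = 1L;
    private final String host;
    private final int numChannels;

    FixedOptionsFactory(String host, int numChannels) {
      this.host = host;
      this.numChannels = numChannels;
    }

    @Override
    public ClientOptions create() {
      return new ClientOptions(host, numChannels);
    }
  }

  // Hypothetical config: serializable because it holds the factory, not the options.
  static final class Config implements Serializable {
    private static final long serialVersionUID = 1L;
    private final ClientOptionsFactory factory;

    Config(ClientOptionsFactory factory) {
      this.factory = factory;
    }

    // The options object is created only here, after deserialization.
    ClientOptions connect() {
      return factory.create();
    }
  }

  // Simulates shipping the config to a worker via Java serialization.
  static Config roundTrip(Config config) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(config);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (Config) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("serialization round trip failed", e);
    }
  }

  public static void main(String[] args) {
    Config shipped = roundTrip(new Config(new FixedOptionsFactory("localhost:9010", 4)));
    ClientOptions options = shipped.connect();
    System.out.println(options.host + " / " + options.numChannels + " channels");
  }
}
```

A named factory class is used here instead of a lambda so that the serialized form does not depend on lambda deserialization details.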





[jira] [Commented] (BEAM-8376) Add FirestoreIO connector to Java SDK

2019-11-06 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968160#comment-16968160
 ] 

Jing Chen commented on BEAM-8376:
-

[~chamikara] Do you have any context you could share?

I could ping you offline for more information and resources on this, if you are 
fine with that.

> Add FirestoreIO connector to Java SDK
> -
>
> Key: BEAM-8376
> URL: https://issues.apache.org/jira/browse/BEAM-8376
> Project: Beam
> Issue Type: New Feature
> Components: io-java-gcp
> Reporter: Stefan Djelekar
> Priority: Major
>
> Motivation:
> There is no Firestore connector for the Java SDK at the moment.
> Having it will enhance the integrations with database options on the Google 
> Cloud Platform.





[jira] [Assigned] (BEAM-8376) Add FirestoreIO connector to Java SDK

2019-11-06 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen reassigned BEAM-8376:
---

Assignee: Jing Chen

> Add FirestoreIO connector to Java SDK
> -
>
> Key: BEAM-8376
> URL: https://issues.apache.org/jira/browse/BEAM-8376
> Project: Beam
> Issue Type: New Feature
> Components: io-java-gcp
> Reporter: Stefan Djelekar
> Assignee: Jing Chen
> Priority: Major
>
> Motivation:
> There is no Firestore connector for the Java SDK at the moment.
> Having it will enhance the integrations with database options on the Google 
> Cloud Platform.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-3493) Prevent users from "implementing" PipelineOptions

2019-11-05 Thread Jing Chen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968095#comment-16968095
 ] 

Jing Chen commented on BEAM-3493:
-

The issue should already be fixed by 
[https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsValidator.java#L69]
 and L70.

I will close the ticket once either [~kenn] or [~lcwik] confirms it.

Thanks,
Jing

> Prevent users from "implementing" PipelineOptions
> -
>
> Key: BEAM-3493
> URL: https://issues.apache.org/jira/browse/BEAM-3493
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Reporter: Kenneth Knowles
> Assignee: Jing Chen
> Priority: Minor
> Labels: newbie, starter
>
> I've seen a user implement {{PipelineOptions}}. This implies that it is 
> backwards-incompatible to add new options, which is of course not our intent. 
> We should at least document very loudly that it is not to be implemented, and 
> preferably have some automation that will fail on load if they have 
> implemented it. Ideas?
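
The ticket asks for automation that fails when a user hand-implements the options interface. One way to get that, sketched below under the assumption that legitimate options objects are always created by a factory as java.lang.reflect.Proxy instances, is to reject anything that is not such a proxy. MyOptions, MapHandler, create and validate are hypothetical names for illustration only and are not taken from the Beam code base.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

final class ProxyOnlyOptionsSketch {

  // Hypothetical options interface that users should extend but never implement.
  interface MyOptions {
    String getJobName();

    void setJobName(String value);
  }

  // Backs getters and setters with a map keyed by property name.
  static final class MapHandler implements InvocationHandler {
    private final Map<String, Object> values = new HashMap<>();

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) {
      String name = method.getName();
      if (name.startsWith("set") && args != null && args.length == 1) {
        values.put(name.substring(3), args[0]);
        return null;
      }
      if (name.startsWith("get")) {
        return values.get(name.substring(3));
      }
      throw new UnsupportedOperationException(name);
    }
  }

  // The only supported way to obtain an options instance in this sketch.
  static <T> T create(Class<T> iface) {
    Object proxy =
        Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, new MapHandler());
    return iface.cast(proxy);
  }

  // Fails for any object that was not produced by create().
  static void validate(Object options) {
    if (!Proxy.isProxyClass(options.getClass())
        || !(Proxy.getInvocationHandler(options) instanceof MapHandler)) {
      throw new IllegalArgumentException(
          options.getClass().getName() + " was not created by the factory");
    }
  }

  // The case the ticket wants to reject: a hand-written implementation.
  static final class HandWritten implements MyOptions {
    private String jobName;

    @Override
    public String getJobName() {
      return jobName;
    }

    @Override
    public void setJobName(String value) {
      this.jobName = value;
    }
  }

  public static void main(String[] args) {
    MyOptions ok = create(MyOptions.class);
    ok.setJobName("demo");
    validate(ok);
    System.out.println("accepted: " + ok.getJobName());
    try {
      validate(new HandWritten());
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Because the check looks at how the object was created rather than at which methods it has, adding a new option to the interface later does not break callers that went through the factory.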





[jira] [Assigned] (BEAM-3493) Prevent users from "implementing" PipelineOptions

2019-11-04 Thread Jing Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen reassigned BEAM-3493:
---

Assignee: Jing Chen

> Prevent users from "implementing" PipelineOptions
> -
>
> Key: BEAM-3493
> URL: https://issues.apache.org/jira/browse/BEAM-3493
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Reporter: Kenneth Knowles
> Assignee: Jing Chen
> Priority: Minor
> Labels: newbie, starter
>
> I've seen a user implement {{PipelineOptions}}. This implies that it is 
> backwards-incompatible to add new options, which is of course not our intent. 
> We should at least document very loudly that it is not to be implemented, and 
> preferably have some automation that will fail on load if they have 
> implemented it. Ideas?





[jira] [Issue Comment Deleted] (BEAM-2857) Create FileIO in Python

2019-03-05 Thread Jing Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Chen updated BEAM-2857:

Comment: was deleted

(was: I want to take a stab at this ticket. Can you please add me to the 
contributor list? Thanks)

> Create FileIO in Python
> ---
>
> Key: BEAM-2857
> URL: https://issues.apache.org/jira/browse/BEAM-2857
> Project: Beam
> Issue Type: New Feature
> Components: sdk-py-core
> Reporter: Eugene Kirpichov
> Assignee: Pablo Estrada
> Priority: Major
> Labels: gsoc, gsoc2019, mentor, triaged
>
> Beam Java has a FileIO with operations: match()/matchAll(), readMatches(), 
> which together cover the majority of needs for general-purpose file 
> ingestion. Beam Python should have something similar.
> An early design document for this: https://s.apache.org/fileio-beam-python



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)