[jira] [Updated] (BEAM-8647) Remove .mailmap from the sources

2020-06-01 Thread Romain Manni-Bucau (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Manni-Bucau updated BEAM-8647:
-
Labels:   (was: stale-P2)

> Remove .mailmap from the sources
> 
>
> Key: BEAM-8647
> URL: https://issues.apache.org/jira/browse/BEAM-8647
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Romain Manni-Bucau
>Priority: P2
>
> Hi,
>  
> .mailmap processes individuals' data that is considered "personal" (name, 
> email, etc.).
> AFAIK Apache/Beam is not allowed to do this directly, in particular for EU 
> citizens (GDPR).
> Can the file be removed, since it is not used by the Beam project (at least 
> not in the apache/beam repo)?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-7890) Rework dependency stack to ensure beam stay lightweight + embeddable

2020-03-05 Thread Romain Manni-Bucau (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17052095#comment-17052095
 ] 

Romain Manni-Bucau commented on BEAM-7890:
--

Side note: this is still an issue and a design "bug" in Beam core, so I am not 
sure "Won't Fix" is that relevant :(

> Rework dependency stack to ensure beam stay lightweight + embeddable
> 
>
> Key: BEAM-7890
> URL: https://issues.apache.org/jira/browse/BEAM-7890
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core, runner-spark, sdk-java-core
>Affects Versions: 2.14.0
>Reporter: Romain Manni-Bucau
>Priority: Major
> Fix For: Not applicable
>
>
> Currently, beam entry cost is > 30M:
>  
> {code:java}
> -rw-r--r-- 1 rmannibucau rmannibucau  13M févr. 17 11:45 
> beam-vendor-grpc-1_13_1-0.2.jar
> -rw-r--r-- 1 rmannibucau rmannibucau 8,7M août   5 10:22 
> beam-sdks-java-core-2.14.0.jar
> -rw-r--r-- 1 rmannibucau rmannibucau 2,6M août   5 10:25 
> beam-vendor-sdks-java-extensions-protobuf-2.14.0.jar
> -rw-r--r-- 1 rmannibucau rmannibucau 2,6M févr. 17 11:45 
> beam-vendor-guava-20_0-0.1.jar
> -rw-r--r-- 1 rmannibucau rmannibucau 1,4M août   5 10:21 
> beam-model-pipeline-2.14.0.jar
> -rw-r--r-- 1 rmannibucau rmannibucau 825K août   5 10:25 
> beam-model-fn-execution-2.14.0.jar
> -rw-r--r-- 1 rmannibucau rmannibucau 470K août   5 10:21 
> beam-model-job-management-2.14.0.jar
> -rw-r--r-- 1 rmannibucau rmannibucau 446K août   5 10:25 
> beam-runners-core-construction-java-2.14.0.jar
> -rw-r--r-- 1 rmannibucau rmannibucau 378K août   5 10:24 
> beam-runners-core-java-2.14.0.jar{code}
> Because Beam is embedded by nature (it is generally sent with the job), it 
> should stay as lightweight as possible. I see a few actions that can help 
> make Beam easy to integrate again:
>  
>  # Make the whole polyglot layer optional and excludable: several jobs never 
> need it, and this additional weight is a clear regression on the packaging 
> side of Beam.
>  # Vendoring and SDK dependencies are generally a luxury (who needs a library 
> to do a new ArrayList<>() in 2019 ;)), so most of the dependencies can be 
> dropped and vendoring can be made very lightweight, not to say optional, for 
> the SDK Java core.
> In the end, a reasonable limit for a runner like Spark (not the direct one, 
> which reimplements all the logic by design) would be around 5M of deps IMHO.





[jira] [Commented] (BEAM-5495) PipelineResources algorithm is not working in most environments

2019-12-18 Thread Romain Manni-Bucau (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16999186#comment-16999186
 ] 

Romain Manni-Bucau commented on BEAM-5495:
--

[~mxm] I think you got misled by GitHub ;) (see 
[https://github.com/classgraph/classgraph/graphs/contributors] vs 
[https://github.com/apache/geronimo-xbean/graphs/contributors]). Anyway, not a 
blocker. I guess it should be made provided/optional in the deployed pom, since 
it is a switchable feature (at least at the SDK level), and ArchUnit or an 
equivalent can be used to ensure classgraph is only used in the scanner class, 
so that it can still be dropped.

> PipelineResources algorithm is not working in most environments
> ---
>
> Key: BEAM-5495
> URL: https://issues.apache.org/jira/browse/BEAM-5495
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, runner-spark, sdk-java-core
>Reporter: Romain Manni-Bucau
>Assignee: Lukasz Gajowy
>Priority: Major
> Fix For: 2.19.0
>
>  Time Spent: 15h 50m
>  Remaining Estimate: 0h
>
> Issues are:
> 1. it assumes the classloader is a URLClassLoader (not always true, and Java 
> >= 9 breaks that as well for the app loader)
> 2. it uses loader.getURLs(), which leads to including the JRE itself in the 
> staged files
> It looks like this resource detection algorithm can't work and should be 
> replaced by an SPI rather than a built-in, non-extensible algorithm. Another 
> valid alternative is to just drop that "guess" logic and force the user to 
> set the staged files.
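
To illustrate the two points in the description, here is a minimal sketch. It is not Beam's PipelineResources code: the class and method names (ClasspathProbe, exposesUrls, splitClasspath, detect) are hypothetical, and falling back to the java.class.path system property is only one possible strategy, assumed here for illustration. It shows that the instanceof check is the fragile part, since the application class loader stopped being a URLClassLoader in Java 9.

```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Beam: probes how a class loader can be inspected.
final class ClasspathProbe {

    /** True only when the loader exposes its URLs; not the case for the app loader on Java 9+. */
    public static boolean exposesUrls(ClassLoader loader) {
        return loader instanceof URLClassLoader;
    }

    /** Splits a classpath string into its non-empty entries. */
    public static List<String> splitClasspath(String classpath) {
        List<String> entries = new ArrayList<>();
        for (String entry : classpath.split(File.pathSeparator)) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    /** Candidate resources: the loader's URLs when available, else the java.class.path entries. */
    public static List<String> detect(ClassLoader loader) {
        if (exposesUrls(loader)) {
            List<String> urls = new ArrayList<>();
            for (URL url : ((URLClassLoader) loader).getURLs()) {
                urls.add(url.toString());
            }
            return urls;
        }
        // Assumed fallback: only what the user put on the classpath, never the JRE itself.
        return splitClasspath(System.getProperty("java.class.path", ""));
    }

    public static void main(String[] args) {
        ClassLoader app = ClassLoader.getSystemClassLoader();
        System.out.println("app loader is a URLClassLoader: " + exposesUrls(app));
        detect(app).forEach(System.out::println);
    }
}
```

Even with such a fallback, the result stays a guess about what the job needs, which is why the description proposes an SPI or explicit staged files instead.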





[jira] [Commented] (BEAM-5495) PipelineResources algorithm is not working in most environments

2019-12-17 Thread Romain Manni-Bucau (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16998451#comment-16998451
 ] 

Romain Manni-Bucau commented on BEAM-5495:
--

I always had to disable it because it was wrong and slow. The environments are 
mainly tests and Livy.

> PipelineResources algorithm is not working in most environments
> ---
>
> Key: BEAM-5495
> URL: https://issues.apache.org/jira/browse/BEAM-5495
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, runner-spark, sdk-java-core
>Reporter: Romain Manni-Bucau
>Assignee: Lukasz Gajowy
>Priority: Major
> Fix For: 2.19.0
>
>  Time Spent: 14.5h
>  Remaining Estimate: 0h
>
> Issues are:
> 1. it assumes the classloader is a URLClassLoader (not always true, and Java 
> >= 9 breaks that as well for the app loader)
> 2. it uses loader.getURLs(), which leads to including the JRE itself in the 
> staged files
> It looks like this resource detection algorithm can't work and should be 
> replaced by an SPI rather than a built-in, non-extensible algorithm. Another 
> valid alternative is to just drop that "guess" logic and force the user to 
> set the staged files.





[jira] [Commented] (BEAM-5495) PipelineResources algorithm is not working in most environments

2019-12-17 Thread Romain Manni-Bucau (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16998350#comment-16998350
 ] 

Romain Manni-Bucau commented on BEAM-5495:
--

OK, really no blame intended. The rationale is that xbean has little impact in 
most environments, whereas classgraph would be a new dependency that is not yet 
managed or on the security scanners' radar, plus the maintenance point.

That said, I think dropping that feature from the core and letting it be a 
plugin is likely the sanest default. Scanning the classpath can only be 
relevant if the user is able to configure the environment explicitly (like 
skipping the parent classloader for Spark) or once the runner is aware of it 
and can contribute to it; neither option exists today. We can indeed enhance 
the SPI, but it makes things complicated, and in the end this is not really a 
submit feature compared to the Job API, which should handle that. So it looks 
like a leak in sdk-core to me.

Any hope that the feature gets extracted into an extension or the Job API 
layer?

> PipelineResources algorithm is not working in most environments
> ---
>
> Key: BEAM-5495
> URL: https://issues.apache.org/jira/browse/BEAM-5495
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, runner-spark, sdk-java-core
>Reporter: Romain Manni-Bucau
>Assignee: Lukasz Gajowy
>Priority: Major
> Fix For: 2.19.0
>
>  Time Spent: 14.5h
>  Remaining Estimate: 0h
>
> Issues are:
> 1. it assumes the classloader is a URLClassLoader (not always true, and Java 
> >= 9 breaks that as well for the app loader)
> 2. it uses loader.getURLs(), which leads to including the JRE itself in the 
> staged files
> It looks like this resource detection algorithm can't work and should be 
> replaced by an SPI rather than a built-in, non-extensible algorithm. Another 
> valid alternative is to just drop that "guess" logic and force the user to 
> set the staged files.





[jira] [Commented] (BEAM-5495) PipelineResources algorithm is not working in most environments

2019-12-17 Thread Romain Manni-Bucau (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16998235#comment-16998235
 ] 

Romain Manni-Bucau commented on BEAM-5495:
--

[~ŁukaszG] hehe, s/did not know/forgot/ ? Joking aside, it is in the list you 
shared in the previous comment (was on the list). My main concern is using a 
library that can die with one person versus having a guarantee of health for 
something in sdk-core. Side note: I am more than happy if this becomes a Beam 
extension; it is always deactivated in real life anyway, since the scanning is 
wrong 99% of the time.

> PipelineResources algorithm is not working in most environments
> ---
>
> Key: BEAM-5495
> URL: https://issues.apache.org/jira/browse/BEAM-5495
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, runner-spark, sdk-java-core
>Reporter: Romain Manni-Bucau
>Assignee: Lukasz Gajowy
>Priority: Major
> Fix For: 2.19.0
>
>  Time Spent: 14.5h
>  Remaining Estimate: 0h
>
> Issues are:
> 1. it assumes the classloader is a URLClassLoader (not always true, and Java 
> >= 9 breaks that as well for the app loader)
> 2. it uses loader.getURLs(), which leads to including the JRE itself in the 
> staged files
> It looks like this resource detection algorithm can't work and should be 
> replaced by an SPI rather than a built-in, non-extensible algorithm. Another 
> valid alternative is to just drop that "guess" logic and force the user to 
> set the staged files.





[jira] [Commented] (BEAM-5495) PipelineResources algorithm is not working in most environments

2019-12-17 Thread Romain Manni-Bucau (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16998139#comment-16998139
 ] 

Romain Manni-Bucau commented on BEAM-5495:
--

One of the PRs fixing this issue used Apache xbean instead of classgraph.

I wonder if there is any rationale for not favoring Apache and instead using a 
one-person library (so riskier for the ASF).

> PipelineResources algorithm is not working in most environments
> ---
>
> Key: BEAM-5495
> URL: https://issues.apache.org/jira/browse/BEAM-5495
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, runner-spark, sdk-java-core
>Reporter: Romain Manni-Bucau
>Assignee: Lukasz Gajowy
>Priority: Major
> Fix For: 2.19.0
>
>  Time Spent: 14.5h
>  Remaining Estimate: 0h
>
> Issues are:
> 1. it assumes the classloader is a URLClassLoader (not always true, and Java 
> >= 9 breaks that as well for the app loader)
> 2. it uses loader.getURLs(), which leads to including the JRE itself in the 
> staged files
> It looks like this resource detection algorithm can't work and should be 
> replaced by an SPI rather than a built-in, non-extensible algorithm. Another 
> valid alternative is to just drop that "guess" logic and force the user to 
> set the staged files.





[jira] [Commented] (BEAM-5495) PipelineResources algorithm is not working in most environments

2019-12-16 Thread Romain Manni-Bucau (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16997492#comment-16997492
 ] 

Romain Manni-Bucau commented on BEAM-5495:
--

Any reason to use a one-man GitHub project (io.github.classgraph:classgraph) 
instead of the Apache xbean proposal?

> PipelineResources algorithm is not working in most environments
> ---
>
> Key: BEAM-5495
> URL: https://issues.apache.org/jira/browse/BEAM-5495
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, runner-spark, sdk-java-core
>Reporter: Romain Manni-Bucau
>Assignee: Lukasz Gajowy
>Priority: Major
> Fix For: 2.19.0
>
>  Time Spent: 14.5h
>  Remaining Estimate: 0h
>
> Issues are:
> 1. it assumes the classloader is a URLClassLoader (not always true, and Java 
> >= 9 breaks that as well for the app loader)
> 2. it uses loader.getURLs(), which leads to including the JRE itself in the 
> staged files
> It looks like this resource detection algorithm can't work and should be 
> replaced by an SPI rather than a built-in, non-extensible algorithm. Another 
> valid alternative is to just drop that "guess" logic and force the user to 
> set the staged files.





[jira] [Commented] (BEAM-7881) Get rid of jackson to avoid the continuous flow of CVEs in Jackson

2019-12-05 Thread Romain Manni-Bucau (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989414#comment-16989414
 ] 

Romain Manni-Bucau commented on BEAM-7881:
--

I will just highlight that the 0-day issue was due to the mere presence of the 
jars, not to the activation of their features, and that Beam does not own the 
Jackson version, the runner does. So the best Beam can do is to decouple 
itself from such libs IMHO.

Now, if the community does not care, please just close the ticket; this is no 
longer a blocker for me.

> Get rid of jackson to avoid the continuous flow of CVEs in Jackson
> --
>
> Key: BEAM-7881
> URL: https://issues.apache.org/jira/browse/BEAM-7881
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Affects Versions: 2.14.0
>Reporter: Romain Manni-Bucau
>Priority: Blocker
>
> Jackson keeps having CVEs in all releases of databind and, transitively, Beam 
> SDK Java core has CVEs in all its releases (for the record, at the time of 
> writing you must use at least jackson-databind 2.9.9.2, but last week it was 
> 2.9.9.1, and 2.14 didn't get the fix).
> It would be neat to get rid of Jackson, which has not fixed this issue for a 
> very long time now, and just use JSON-B or another JSON implementation to 
> ensure the CVE is not exploitable merely because Beam is present.





[jira] [Created] (BEAM-8647) Remove .mailmap from the sources

2019-11-13 Thread Romain Manni-Bucau (Jira)
Romain Manni-Bucau created BEAM-8647:


 Summary: Remove .mailmap from the sources
 Key: BEAM-8647
 URL: https://issues.apache.org/jira/browse/BEAM-8647
 Project: Beam
  Issue Type: Task
  Components: beam-community
Reporter: Romain Manni-Bucau
Assignee: Aizhamal Nurmamat kyzy


Hi,

 

.mailmap processes individuals' data that is considered "personal" (name, 
email, etc.).

AFAIK Apache/Beam is not allowed to do this directly, in particular for EU 
citizens (GDPR).

Can the file be removed, since it is not used by the Beam project (at least 
not in the apache/beam repo)?

 





[jira] [Commented] (BEAM-6481) Relocations don't include sources

2019-10-23 Thread Romain Manni-Bucau (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957709#comment-16957709
 ] 

Romain Manni-Bucau commented on BEAM-6481:
--

Up?

> Relocations don't include sources
> -
>
> Key: BEAM-6481
> URL: https://issues.apache.org/jira/browse/BEAM-6481
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Romain Manni-Bucau
>Priority: Major
> Fix For: Not applicable
>
>
> Beam uses a lot of relocations to shade/shadow libraries and try to avoid 
> conflicts; however, the setup is not complete and the -sources.jar files 
> don't include the sources of the shaded classes, making debugging less 
> friendly.
> Can you please fix that?
> Thanks,
> Romain





[jira] [Updated] (BEAM-8420) Beam Model sources (classifier) jar don't contain sources

2019-10-22 Thread Romain Manni-Bucau (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Manni-Bucau updated BEAM-8420:
-
Description: These packages are generated from proto files. The jars only 
contain META-INF/MANIFEST.MF. They should either not be distributed or should 
contain the generated java source files and optionally the proto files.  (was: 
These packages are generated from proto files. The jars only contain 
META-INF/MANIFEST.MF. They should either not be distributed or should contain 
the source proto files.)

> Beam Model sources (classifier) jar don't contain sources
> -
>
> Key: BEAM-8420
> URL: https://issues.apache.org/jira/browse/BEAM-8420
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Affects Versions: 2.16.0
>Reporter: Romain Manni-Bucau
>Priority: Major
>
> These packages are generated from proto files. The jars only contain 
> META-INF/MANIFEST.MF. They should either not be distributed or should contain 
> the generated java source files and optionally the proto files.





[jira] [Updated] (BEAM-8420) Beam Model sources (classifier) jar don't contain sources

2019-10-17 Thread Romain Manni-Bucau (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Manni-Bucau updated BEAM-8420:
-
Summary: Beam Model sources (classifier) jar don't contain sources  (was: 
Beam Model sources jar don't contain sources)

> Beam Model sources (classifier) jar don't contain sources
> -
>
> Key: BEAM-8420
> URL: https://issues.apache.org/jira/browse/BEAM-8420
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Affects Versions: 2.16.0
>Reporter: Romain Manni-Bucau
>Priority: Major
>






[jira] [Created] (BEAM-8421) Job API relies on org.apache.beam.vendor.

2019-10-17 Thread Romain Manni-Bucau (Jira)
Romain Manni-Bucau created BEAM-8421:


 Summary: Job API relies on org.apache.beam.vendor.
 Key: BEAM-8421
 URL: https://issues.apache.org/jira/browse/BEAM-8421
 Project: Beam
  Issue Type: Bug
  Components: beam-model
Affects Versions: 2.16.0
Reporter: Romain Manni-Bucau








[jira] [Updated] (BEAM-8421) Job API relies on org.apache.beam.vendor.

2019-10-17 Thread Romain Manni-Bucau (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Manni-Bucau updated BEAM-8421:
-
Description: API shouldn't rely on any internal

> Job API relies on org.apache.beam.vendor.
> -
>
> Key: BEAM-8421
> URL: https://issues.apache.org/jira/browse/BEAM-8421
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Affects Versions: 2.16.0
>Reporter: Romain Manni-Bucau
>Priority: Major
>
> API shouldn't rely on any internal





[jira] [Created] (BEAM-8420) Beam Model sources jar don't contain sources

2019-10-17 Thread Romain Manni-Bucau (Jira)
Romain Manni-Bucau created BEAM-8420:


 Summary: Beam Model sources jar don't contain sources
 Key: BEAM-8420
 URL: https://issues.apache.org/jira/browse/BEAM-8420
 Project: Beam
  Issue Type: Bug
  Components: build-system
Affects Versions: 2.16.0
Reporter: Romain Manni-Bucau








[jira] [Commented] (BEAM-7881) Get rid of jackson to avoid the continuous flow of CVEs in Jackson

2019-10-09 Thread Romain Manni-Bucau (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948204#comment-16948204
 ] 

Romain Manni-Bucau commented on BEAM-7881:
--

Well, I am not really panicking, but I am a bit tired of that issue.

You need to consider multiple points on that:
 # Jackson on its own is in better shape and does require an explicit list of 
*classnames* if the feature is activated - note that this is not all that was 
done
 # Projects can't review all usages each time an issue is found, so the library 
is expected to be CVE-free anyway
 # Jackson still makes it possible to exploit the issue through its overly 
user-friendly configuration
 # Beam must also ensure there is no issue in any of the usable runner stacks
 # Most Beam code can be exploited from an endpoint or external system by 
design, even if indirectly (not everything is just cron-scheduled ;)) 
 # Beam is often combined with other libs that can exploit this, so not having 
Jackson at all is more drastic but effective, and it saves an investigation for 
each release, which is very costly for end users for literally no gain

 

> Get rid of jackson to avoid the continuous flow of CVEs in Jackson
> --
>
> Key: BEAM-7881
> URL: https://issues.apache.org/jira/browse/BEAM-7881
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Affects Versions: 2.14.0
>Reporter: Romain Manni-Bucau
>Priority: Blocker
>
> Jackson keeps having CVEs in all releases of databind and, transitively, Beam 
> SDK Java core has CVEs in all its releases (for the record, at the time of 
> writing you must use at least jackson-databind 2.9.9.2, but last week it was 
> 2.9.9.1, and 2.14 didn't get the fix).
> It would be neat to get rid of Jackson, which has not fixed this issue for a 
> very long time now, and just use JSON-B or another JSON implementation to 
> ensure the CVE is not exploitable merely because Beam is present.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-7881) Get rid of jackson to avoid the continuous flow of CVEs in Jackson

2019-09-20 Thread Romain Manni-Bucau (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934747#comment-16934747
 ] 

Romain Manni-Bucau commented on BEAM-7881:
--

Up: Jackson's lack of care about security is a real concern that should be 
addressed IMHO.

Any hope of getting it fixed soon?

> Get rid of jackson to avoid the continuous flow of CVEs in Jackson
> --
>
> Key: BEAM-7881
> URL: https://issues.apache.org/jira/browse/BEAM-7881
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Affects Versions: 2.14.0
>Reporter: Romain Manni-Bucau
>Priority: Blocker
>
> Jackson keeps having CVEs in all releases of databind and, transitively, Beam 
> SDK Java core has CVEs in all its releases (for the record, at the time of 
> writing you must use at least jackson-databind 2.9.9.2, but last week it was 
> 2.9.9.1, and 2.14 didn't get the fix).
> It would be neat to get rid of Jackson, which has not fixed this issue for a 
> very long time now, and just use JSON-B or another JSON implementation to 
> ensure the CVE is not exploitable merely because Beam is present.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-8128) Don't deprecate Read for Impulse

2019-09-02 Thread Romain Manni-Bucau (Jira)
Romain Manni-Bucau created BEAM-8128:


 Summary: Don't deprecate Read for Impulse
 Key: BEAM-8128
 URL: https://issues.apache.org/jira/browse/BEAM-8128
 Project: Beam
  Issue Type: Bug
  Components: runner-core, sdk-java-core
Affects Versions: 2.15.0
Reporter: Romain Manni-Bucau


In the last Beam release, Read.Bounded and Read.Unbounded were deprecated, and 
Beam is moving toward Impulse.

This is a huge breaking change, since users can no longer rely on a custom 
pre-runner pipeline visitor to instrument their pipelines, or even identify the 
transform accurately.

This issue is about ensuring that, SDF or not, Read.Bounded and Read.Unbounded 
remain stable transform matchers that user code can still use to identify 
inputs.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (BEAM-7891) Vendoring packaging is still buggy

2019-08-05 Thread Romain Manni-Bucau (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Manni-Bucau updated BEAM-7891:
-
Description: 
In 2.14 the overlapping-classes bug between modules is still not fixed: it 
still prevents using Beam with some JVMs, heavily pollutes shading/uber jar 
creation, and can prevent Beam from running under some classloading setups 
(potentially in an engine/runner). Here is one example:

 
{code:java}
[INFO] [WARNING] beam-vendor-grpc-1_13_1-0.2.jar, 
beam-vendor-sdks-java-extensions-protobuf-2.14.0.jar define 1814 overlapping 
classes:
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.collect.ImmutableMapValues$1
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.util.concurrent.ImmediateFuture$ImmediateCancelledFuture
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.base.Converter$ReverseConverter
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.hash.HashCode$IntHashCode
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.collect.Iterables$8$1
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.collect.HashBiMap
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.cache.CacheBuilderSpec$WriteDurationParser
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.collect.Multiset$Entry
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.graph.AbstractValueGraph
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.util.concurrent.InterruptibleTask{code}
This task is indeed about fixing the overlaps, but also about ensuring they 
can't come back in 2.15: all versions have been affected since vendoring was 
set up, and it has never been cleanly fixed across the whole build.

 

Thanks

  was:
In 2.14 the overlapping-classes bug between modules is still not fixed: it 
still prevents using Beam with some JVMs and heavily pollutes shading/uber jar 
creation. Here is one example:

 
{code:java}
[INFO] [WARNING] beam-vendor-grpc-1_13_1-0.2.jar, 
beam-vendor-sdks-java-extensions-protobuf-2.14.0.jar define 1814 overlapping 
classes:
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.collect.ImmutableMapValues$1
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.util.concurrent.ImmediateFuture$ImmediateCancelledFuture
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.base.Converter$ReverseConverter
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.hash.HashCode$IntHashCode
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.collect.Iterables$8$1
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.collect.HashBiMap
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.cache.CacheBuilderSpec$WriteDurationParser
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.collect.Multiset$Entry
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.graph.AbstractValueGraph
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.util.concurrent.InterruptibleTask{code}
This task is indeed about fixing the overlaps, but also about ensuring they 
can't come back in 2.15: all versions have been affected since vendoring was 
set up, and it has never been cleanly fixed across the whole build.

 

Thanks


> Vendoring packaging is still buggy
> --
>
> Key: BEAM-7891
> URL: https://issues.apache.org/jira/browse/BEAM-7891
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Romain Manni-Bucau
>Priority: Blocker
> Fix For: 2.14.0
>
>
> In 2.14 the overlapping-classes bug between modules is still not fixed: it 
> still prevents using Beam with some JVMs, heavily pollutes shading/uber jar 
> creation, and can prevent Beam from running under some classloading setups 
> (potentially in an engine/runner). Here is one example:
>  
> {code:java}
> [INFO] [WARNING] beam-vendor-grpc-1_13_1-0.2.jar, 
> beam-vendor-sdks-java-extensions-protobuf-2.14.0.jar define 1814 overlapping 
> classes:
> [INFO] [WARNING] - 
> org.apache.beam.vendor.grpc.v1p13p1.com.google.common.collect.ImmutableMapValues$1
> [INFO] [WARNING] - 
> org.apache.beam.vendor.grpc.v1p13p1.com.google.common.util.concurrent.ImmediateFuture$ImmediateCancelledFuture
> [INFO] [WARNING] - 
> org.apache.beam.vendor.grpc.v1p13p1.com.google.common.base.Converter$ReverseConverter
> [INFO] [WARNING] - 
> org.apache.beam.vendor.grpc.v1p13p1.com.google.common.hash.HashCode$IntHashCode
> [INFO] [WARNING] - 
> org.apache.beam.vendor.grpc.v1p13p1.com.google.common.collect.Iterables$8$1
> [INFO] [WARNING] - 
> 

[jira] [Created] (BEAM-7891) Vendoring packaging is still buggy

2019-08-05 Thread Romain Manni-Bucau (JIRA)
Romain Manni-Bucau created BEAM-7891:


 Summary: Vendoring packaging is still buggy
 Key: BEAM-7891
 URL: https://issues.apache.org/jira/browse/BEAM-7891
 Project: Beam
  Issue Type: Bug
  Components: build-system
Reporter: Romain Manni-Bucau
 Fix For: 2.14.0


In 2.14 the overlapping-classes bug between modules is still not fixed: it 
still prevents using Beam with some JVMs and heavily pollutes shading/uber jar 
creation. Here is one example:

 
{code:java}
[INFO] [WARNING] beam-vendor-grpc-1_13_1-0.2.jar, 
beam-vendor-sdks-java-extensions-protobuf-2.14.0.jar define 1814 overlapping 
classes:
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.collect.ImmutableMapValues$1
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.util.concurrent.ImmediateFuture$ImmediateCancelledFuture
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.base.Converter$ReverseConverter
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.hash.HashCode$IntHashCode
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.collect.Iterables$8$1
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.collect.HashBiMap
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.cache.CacheBuilderSpec$WriteDurationParser
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.collect.Multiset$Entry
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.graph.AbstractValueGraph
[INFO] [WARNING] - 
org.apache.beam.vendor.grpc.v1p13p1.com.google.common.util.concurrent.InterruptibleTask{code}
This task is indeed about fixing the overlaps, but also about ensuring they 
can't come back in 2.15: all versions have been affected since vendoring was 
set up, and it has never been cleanly fixed across the whole build.
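
A check like the one that produced the warnings above can be sketched with the JDK alone. This is only an illustration under stated assumptions, not the Maven Shade plugin's code nor anything in Beam's build: the class and method names (OverlapCheck, classEntries, overlapping) are hypothetical, and the jar paths are taken from the command line. It lists the .class entries that two jars both define, which a build could use as a guard to fail when the count is not zero.

```java
import java.io.IOException;
import java.util.Set;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical guard, not part of Beam: reports classes defined by two jars at once.
final class OverlapCheck {

    /** Returns the names of all .class entries in the given jar, sorted. */
    public static Set<String> classEntries(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.stream()
                    .map(JarEntry::getName)
                    .filter(name -> name.endsWith(".class"))
                    .collect(Collectors.toCollection(TreeSet::new));
        }
    }

    /** Returns the .class entries that both jars define. */
    public static Set<String> overlapping(String firstJar, String secondJar) throws IOException {
        Set<String> common = classEntries(firstJar);
        common.retainAll(classEntries(secondJar));
        return common;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: OverlapCheck <first.jar> <second.jar>");
            return;
        }
        Set<String> common = overlapping(args[0], args[1]);
        System.out.println(args[0] + ", " + args[1] + " define "
                + common.size() + " overlapping classes");
        common.forEach(name -> System.out.println(" - " + name));
    }
}
```

Run against the two jars named in the warning, it would be expected to list the same duplicated vendored classes, although that has not been verified here.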

 

Thanks



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (BEAM-7890) Rework dependency stack to ensure beam stay lightweight + embeddable

2019-08-05 Thread Romain Manni-Bucau (JIRA)
Romain Manni-Bucau created BEAM-7890:


 Summary: Rework dependency stack to ensure beam stay lightweight + 
embeddable
 Key: BEAM-7890
 URL: https://issues.apache.org/jira/browse/BEAM-7890
 Project: Beam
  Issue Type: Bug
  Components: runner-core, runner-spark, sdk-java-core
Affects Versions: 2.14.0
Reporter: Romain Manni-Bucau


Currently, beam entry cost is > 30M:

 
{code:java}
-rw-r--r-- 1 rmannibucau rmannibucau  13M févr. 17 11:45 beam-vendor-grpc-1_13_1-0.2.jar
-rw-r--r-- 1 rmannibucau rmannibucau 8,7M août   5 10:22 beam-sdks-java-core-2.14.0.jar
-rw-r--r-- 1 rmannibucau rmannibucau 2,6M août   5 10:25 beam-vendor-sdks-java-extensions-protobuf-2.14.0.jar
-rw-r--r-- 1 rmannibucau rmannibucau 2,6M févr. 17 11:45 beam-vendor-guava-20_0-0.1.jar
-rw-r--r-- 1 rmannibucau rmannibucau 1,4M août   5 10:21 beam-model-pipeline-2.14.0.jar
-rw-r--r-- 1 rmannibucau rmannibucau 825K août   5 10:25 beam-model-fn-execution-2.14.0.jar
-rw-r--r-- 1 rmannibucau rmannibucau 470K août   5 10:21 beam-model-job-management-2.14.0.jar
-rw-r--r-- 1 rmannibucau rmannibucau 446K août   5 10:25 beam-runners-core-construction-java-2.14.0.jar
-rw-r--r-- 1 rmannibucau rmannibucau 378K août   5 10:24 beam-runners-core-java-2.14.0.jar
{code}
Because Beam is embedded (it is generally sent with the job), it should stay as 
lightweight as possible. I see a few actions which can help make Beam easy to 
integrate again:

 
 # Make the whole polyglot layer optional and excludable: several jobs never 
need it, and this additional weight is a clear regression on the packaging 
side of Beam,
 # Vendoring and SDK dependencies are generally a luxury (who needs a library to 
do a new ArrayList<>() in 2019 ;)), so most of the dependencies can be dropped 
and vendoring can be made very lightweight, not to say optional, for the SDK 
Java core.

In the end, a reasonable limit for a runner like Spark (not the direct one, 
which reimplements all the logic by design) would be around 5M of deps, IMHO.
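
A minimal sketch of how such a size budget could be checked against a directory of resolved jars. The `DependencyBudget` class is hypothetical (not an existing Beam or Gradle facility), and the 5M figure is simply the target suggested above:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

/** Hypothetical check: sums the jar sizes of a directory and compares them to a budget. */
final class DependencyBudget {
    private DependencyBudget() {}

    /** Sums the size in bytes of every *.jar file directly inside dir. */
    static long totalJarBytes(Path dir) throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            return files
                .filter(p -> p.getFileName().toString().endsWith(".jar"))
                .mapToLong(p -> {
                    try {
                        return Files.size(p);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                })
                .sum();
        }
    }

    static boolean withinBudget(Path dir, long budgetBytes) throws IOException {
        return totalJarBytes(dir) <= budgetBytes;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        long budget = 5L * 1024 * 1024; // the ~5M target suggested above
        long total = totalJarBytes(dir);
        System.out.printf("%d bytes of jars, budget %d: %s%n",
            total, budget, total <= budget ? "OK" : "TOO BIG");
    }
}
```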





[jira] [Created] (BEAM-7881) [CVE] Get rid of jackson or ensure it has no CVE

2019-08-02 Thread Romain Manni-Bucau (JIRA)
Romain Manni-Bucau created BEAM-7881:


 Summary: [CVE] Get rid of jackson or ensure it has no CVE
 Key: BEAM-7881
 URL: https://issues.apache.org/jira/browse/BEAM-7881
 Project: Beam
  Issue Type: Task
  Components: sdk-java-core
Affects Versions: 2.14.0
Reporter: Romain Manni-Bucau


Jackson keeps getting CVEs on every release of databind and, transitively, the 
Beam Java SDK core has a CVE on all its releases (for the record, at the time 
of writing you must use at least jackson-databind 2.9.9.2, but last week it was 
2.9.9.1, and 2.14 didn't get the fix).

It would be neat to get rid of Jackson, which has not fixed this kind of issue 
for a very long time now, and just use JSON-B or another JSON implementation to 
ensure the CVE is not exploitable just because Beam is there.
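
As long as Jackson stays, a build or startup guard could at least verify the minimum version mentioned above. This is a minimal sketch under the assumption that the resolved version is available as a purely numeric dotted string. `MinVersion` is a hypothetical helper and it does not implement Maven's full version ordering (qualifiers such as `-SNAPSHOT` are not handled):

```java
/** Hypothetical guard: compares purely numeric dotted versions such as "2.9.9.2". */
final class MinVersion {
    private MinVersion() {}

    /** Returns a negative, zero or positive value, like Comparable#compareTo. */
    static int compare(String left, String right) {
        String[] l = left.split("\\.");
        String[] r = right.split("\\.");
        for (int i = 0; i < Math.max(l.length, r.length); i++) {
            // A missing segment counts as 0, so "2.9.9" is treated as "2.9.9.0".
            int a = i < l.length ? Integer.parseInt(l[i]) : 0;
            int b = i < r.length ? Integer.parseInt(r[i]) : 0;
            if (a != b) {
                return Integer.compare(a, b);
            }
        }
        return 0;
    }

    static boolean atLeast(String actual, String minimum) {
        return compare(actual, minimum) >= 0;
    }

    public static void main(String[] args) {
        String minimum = "2.9.9.2"; // minimum jackson-databind version quoted above
        for (String candidate : new String[] {"2.9.9.1", "2.9.9.2", "2.10.0"}) {
            System.out.println(candidate + " acceptable: " + atLeast(candidate, minimum));
        }
    }
}
```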





[jira] [Commented] (BEAM-7328) Update Avro to version 1.9.0 in Java SDK

2019-07-09 Thread Romain Manni-Bucau (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881442#comment-16881442
 ] 

Romain Manni-Bucau commented on BEAM-7328:
--

Hi,

Is it possible to drop Avro from sdk-java-core before that, without package 
overlap, to keep Java >= 8 compatibility? Otherwise, upgrading Avro in Beam 
means that users cannot upgrade Beam, since the two Avro versions are not 
compatible and the whole industry relies on Avro 1.8. Typically, 
https://lists.apache.org/thread.html/fa3508957fddf19b9ec1546cb1279642f52b807dd5f161e674bdc782@%3Cdev.avro.apache.org%3E
 misses S3, Kafka, several metadata repository solutions and other big data 
storage systems, which means that upgrading the whole ecosystem in less than 
5 years does not sound realistic, in particular when it comes to cloud products 
and existing storage. However, upgrading Beam must be possible (at least for 
CVEs), so I am not sure there are many options other than releasing Beam with 
Avro as an extension/IO, no longer in the core code path, and only then adding 
an avro19 module.

Side note: I know Beam loves relocations, but they do not work for Avro, so 
that is not a potential workaround.

Romain

> Update Avro to version 1.9.0 in Java SDK
> 
>
> Key: BEAM-7328
> URL: https://issues.apache.org/jira/browse/BEAM-7328
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>
> Avro 1.9.0 has nice improvements like a reduced size (1MB less), multiple 
> dependencies are not needed anymore (Guava, paranamer, etc.), as well as 
> cleanups in its APIs to not expose and be tied to Jackson, so it is a 
> worthwhile upgrade.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7556) Enable to upgrade proxy generation independently of beam for java support

2019-06-14 Thread Romain Manni-Bucau (JIRA)
Romain Manni-Bucau created BEAM-7556:


 Summary: Enable to upgrade proxy generation independently of beam 
for java support
 Key: BEAM-7556
 URL: https://issues.apache.org/jira/browse/BEAM-7556
 Project: Beam
  Issue Type: Task
  Components: sdk-java-core
Affects Versions: 2.13.0
Reporter: Romain Manni-Bucau


Beam is now using a custom shaded version of Byte Buddy, which makes it 
impossible, unless you reshade, to upgrade Byte Buddy without requiring a new 
Beam release.

However, with the fast release rate of the JVM, it is important to be able to 
upgrade Byte Buddy (at least while Beam is using it, which is technically not a 
strong requirement) in order to run on new JVMs.

For example, the latest Beam release does not support recent Java versions:

{code}
Caused by: java.lang.UnsupportedOperationException: Cannot define class using 
reflection: Cannot define nest member class 
java.lang.reflect.AccessibleObject$Cache + within different package then class 
org.apache.beam.repackaged.beam_sdks_java_core.net.bytebuddy.mirror.AccessibleObject
{code}

My preferred fix would be to relax the proxying definition to just use a 
"proxy classloader" where the proxy would be defined, but this requires being 
able to attach it to an execution, where Beam is not yet super clean.
The alternative is to have an SPI for the ASM usage and let the user replace 
the Byte Buddy implementation with either a non-shaded version or even a pure 
ASM one, so that they control the dependencies.
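
A minimal sketch of what the SPI alternative could look like, using the standard `java.util.ServiceLoader`. The `ProxyFactory` interface and its implementations are hypothetical names, not existing Beam types. The fallback shown here relies on JDK dynamic proxies, which only cover interfaces, whereas a Byte Buddy or ASM based provider could also subclass concrete classes:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Iterator;
import java.util.ServiceLoader;

/** Hypothetical SPI (not an existing Beam interface) for pluggable proxy generation. */
interface ProxyFactory {
    <T> T create(ClassLoader loader, Class<T> api, InvocationHandler handler);
}

/** Fallback implementation based on JDK dynamic proxies (interfaces only). */
final class JdkProxyFactory implements ProxyFactory {
    @Override
    public <T> T create(ClassLoader loader, Class<T> api, InvocationHandler handler) {
        return api.cast(Proxy.newProxyInstance(loader, new Class<?>[] {api}, handler));
    }
}

final class ProxyFactories {
    private ProxyFactories() {}

    /** Uses the first provider registered in META-INF/services, else the JDK fallback. */
    static ProxyFactory load(ClassLoader loader) {
        Iterator<ProxyFactory> providers =
            ServiceLoader.load(ProxyFactory.class, loader).iterator();
        return providers.hasNext() ? providers.next() : new JdkProxyFactory();
    }

    public static void main(String[] args) {
        ClassLoader loader = ProxyFactories.class.getClassLoader();
        ProxyFactory factory = load(loader);
        Runnable task = factory.create(loader, Runnable.class, (proxy, method, params) -> {
            System.out.println("intercepted " + method.getName());
            return null;
        });
        task.run();
    }
}
```

With such a contract, a user could ship their own provider built on a non-shaded Byte Buddy or on plain ASM and upgrade it independently of a Beam release.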

Romain





[jira] [Commented] (BEAM-7042) [regression] org.apache.beam:beam-sdks-java-core:jar:2.12.0 dependencies are way fatter and conflicting than before

2019-04-09 Thread Romain Manni-Bucau (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814034#comment-16814034
 ] 

Romain Manni-Bucau commented on BEAM-7042:
--

[~reuvenlax] as mentioned, it is a regression and therefore a blocker. It was 
NOT the case in 2.10 and 2.11.

> [regression] org.apache.beam:beam-sdks-java-core:jar:2.12.0 dependencies are 
> way fatter and conflicting than before
> ---
>
> Key: BEAM-7042
> URL: https://issues.apache.org/jira/browse/BEAM-7042
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Romain Manni-Bucau
>Assignee: Reuven Lax
>Priority: Major
> Fix For: 2.13.0
>
>
> Hi guys,
> The current SDK seems to bring in antlr and its transitive deps, which is 
> quite unexpected and has a huge probability of conflicting (antlr, jsonp in 
> an outdated version, at least for the ones breaking my apps).
> Can it be cleaned up for 2.12 to avoid breaking applications, please?
> Thanks,
> Romain





[jira] [Commented] (BEAM-7042) [regression] org.apache.beam:beam-sdks-java-core:jar:2.12.0 dependencies are way fatter and conflicting than before

2019-04-09 Thread Romain Manni-Bucau (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813806#comment-16813806
 ] 

Romain Manni-Bucau commented on BEAM-7042:
--

[~reuvenlax] no, don't mix consumer and producer. Antlr was not added to the 
POMs, so consumers were safe. This is no longer true. I guess the 
relocation/vendoring work broke something?

> [regression] org.apache.beam:beam-sdks-java-core:jar:2.12.0 dependencies are 
> way fatter and conflicting than before
> ---
>
> Key: BEAM-7042
> URL: https://issues.apache.org/jira/browse/BEAM-7042
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Romain Manni-Bucau
>Priority: Major
> Fix For: 2.12.0
>
>
> Hi guys,
> The current SDK seems to bring in antlr and its transitive deps, which is 
> quite unexpected and has a huge probability of conflicting (antlr, jsonp in 
> an outdated version, at least for the ones breaking my apps).
> Can it be cleaned up for 2.12 to avoid breaking applications, please?
> Thanks,
> Romain





[jira] [Commented] (BEAM-7042) [regression] org.apache.beam:beam-sdks-java-core:jar:2.12.0 dependencies are way fatter and conflicting than before

2019-04-09 Thread Romain Manni-Bucau (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813628#comment-16813628
 ] 

Romain Manni-Bucau commented on BEAM-7042:
--

Another good and saner option is, IMHO, to not clutter the SDK core with 
schemas and make them an extension; there is no blocker to doing so. Also note 
that ArgumentProvider would benefit from becoming an SPI, which would unify 
injectable types and would even be a good user feature, IMHO. These 
architecture fixes would cleanly solve this issue and prevent it from coming 
back yet another time.

> [regression] org.apache.beam:beam-sdks-java-core:jar:2.12.0 dependencies are 
> way fatter and conflicting than before
> ---
>
> Key: BEAM-7042
> URL: https://issues.apache.org/jira/browse/BEAM-7042
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Romain Manni-Bucau
>Priority: Major
> Fix For: 2.12.0
>
>
> Hi guys,
> The current SDK seems to bring in antlr and its transitive deps, which is 
> quite unexpected and has a huge probability of conflicting (antlr, jsonp in 
> an outdated version, at least for the ones breaking my apps).
> Can it be cleaned up for 2.12 to avoid breaking applications, please?
> Thanks,
> Romain





[jira] [Created] (BEAM-7042) [regression] org.apache.beam:beam-sdks-java-core:jar:2.12.0 dependencies are way fatter and conflicting than before

2019-04-09 Thread Romain Manni-Bucau (JIRA)
Romain Manni-Bucau created BEAM-7042:


 Summary: [regression] 
org.apache.beam:beam-sdks-java-core:jar:2.12.0 dependencies are way fatter and 
conflicting than before
 Key: BEAM-7042
 URL: https://issues.apache.org/jira/browse/BEAM-7042
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Romain Manni-Bucau
 Fix For: 2.12.0


Hi guys,

The current SDK seems to bring in antlr and its transitive deps, which is quite 
unexpected and has a huge probability of conflicting (antlr, jsonp in an 
outdated version, at least for the ones breaking my apps).

Can it be cleaned up for 2.12 to avoid breaking applications, please?

Thanks,
Romain





[jira] [Commented] (BEAM-6519) java extension must not use org.apache.beam.sdk.util

2019-04-03 Thread Romain Manni-Bucau (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809504#comment-16809504
 ] 

Romain Manni-Bucau commented on BEAM-6519:
--

Hi [~lcwik], seems it fixes it yes. Thanks!

> java extension must not use org.apache.beam.sdk.util
> 
>
> Key: BEAM-6519
> URL: https://issues.apache.org/jira/browse/BEAM-6519
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Affects Versions: 2.9.0
>Reporter: Romain Manni-Bucau
>Assignee: Luke Cwik
>Priority: Blocker
>  Labels: triaged
>
> Some shades of Beam reuse SDK Java core packages (the util one in particular).
> This prevents using Beam in several environments (OSGi, classloader 
> isolation, Java 11/JPMS, etc.); please fix it ASAP.





[jira] [Commented] (BEAM-6519) java extension must not use org.apache.beam.sdk.util

2019-03-26 Thread Romain Manni-Bucau (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802019#comment-16802019
 ] 

Romain Manni-Bucau commented on BEAM-6519:
--

Beam does not run in the mentioned environments, so I guess so.

> java extension must not use org.apache.beam.sdk.util
> 
>
> Key: BEAM-6519
> URL: https://issues.apache.org/jira/browse/BEAM-6519
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Affects Versions: 2.9.0
>Reporter: Romain Manni-Bucau
>Assignee: Romain Manni-Bucau
>Priority: Blocker
>  Labels: triaged
> Fix For: 2.12.0
>
>
> Some shades of Beam reuse SDK Java core packages (the util one in particular).
> This prevents using Beam in several environments (OSGi, classloader 
> isolation, Java 11/JPMS, etc.); please fix it ASAP.





[jira] [Created] (BEAM-6808) Use gav or something equivalent in announcement for dependency upgrades

2019-03-11 Thread Romain Manni-Bucau (JIRA)
Romain Manni-Bucau created BEAM-6808:


 Summary: Use gav or something equivalent in announcement for 
dependency upgrades
 Key: BEAM-6808
 URL: https://issues.apache.org/jira/browse/BEAM-6808
 Project: Beam
  Issue Type: Improvement
  Components: build-system
Affects Versions: 2.11.0
Reporter: Romain Manni-Bucau


The announcement/changelog uses Gradle variables, which is not very user 
friendly since they are Beam internals. It would be great to move to actual 
GAVs (groupId:artifactId:version).





[jira] [Commented] (BEAM-6770) Correct zstd-jni dependency scope to optional

2019-03-06 Thread Romain Manni-Bucau (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785805#comment-16785805
 ] 

Romain Manni-Bucau commented on BEAM-6770:
--

With Gradle you can always create an "optional" configuration and add it to the 
test configuration in the build script. This is already done in some places in 
Beam for other concerns (IIRC it was to add validates runner in test scope in 
runner modules).

> Correct zstd-jni dependency scope to optional
> -
>
> Key: BEAM-6770
> URL: https://issues.apache.org/jira/browse/BEAM-6770
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Romain Manni-Bucau
>Priority: Major
>
> Beam 2.11.0 introduced a new transitive dependency, zstd-jni. AFAIK it is not 
> needed in most cases, so it shouldn't be there by default. I also saw it was 
> configured as shadow in the SDK core Java module, so I am not sure whether it 
> is a Gradle build bug or intended, but I think sdk-core-java should be 
> cleaned up because it is now very fat and does not match a lot of usages. 
> Finally, since this lib is native, it is not that sane to bring it in by 
> default, in particular with the dockerization happening right now and the 
> goal of having a light container stack (which often implies not using a 
> standard Linux as FROM).





[jira] [Created] (BEAM-6770) Cleanup dependencies

2019-03-06 Thread Romain Manni-Bucau (JIRA)
Romain Manni-Bucau created BEAM-6770:


 Summary: Cleanup dependencies
 Key: BEAM-6770
 URL: https://issues.apache.org/jira/browse/BEAM-6770
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Romain Manni-Bucau


Beam 2.11.0 introduced a new transitive dependency, zstd-jni. AFAIK it is not 
needed in most cases, so it shouldn't be there by default. I also saw it was 
configured as shadow in the SDK core Java module, so I am not sure whether it 
is a Gradle build bug or intended, but I think sdk-core-java should be cleaned 
up because it is now very fat and does not match a lot of usages. Finally, 
since this lib is native, it is not that sane to bring it in by default, in 
particular with the dockerization happening right now and the goal of having a 
light container stack (which often implies not using a standard Linux as FROM).
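
When a dependency is declared optional, the code that uses it has to cope with its absence at runtime. A minimal sketch of that pattern, where `OptionalDependency` is a hypothetical helper and `com.github.luben.zstd.ZstdInputStream` is assumed to be the zstd-jni class being probed:

```java
/** Hypothetical helper: checks whether an optional library is on the classpath. */
final class OptionalDependency {
    private OptionalDependency() {}

    /** Returns true when className can be loaded by loader, without initializing it. */
    static boolean isPresent(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing jar, or a native/transitive part that cannot be linked.
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = OptionalDependency.class.getClassLoader();
        // Assumed class name from zstd-jni; only needed when zstd compression is used.
        boolean zstd = isPresent("com.github.luben.zstd.ZstdInputStream", loader);
        System.out.println(zstd
            ? "zstd support enabled"
            : "zstd-jni not on the classpath, zstd support disabled");
    }
}
```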





[jira] [Commented] (BEAM-5495) PipelineResources algorithm is not working in most environments

2019-02-27 Thread Romain Manni-Bucau (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779293#comment-16779293
 ] 

Romain Manni-Bucau commented on BEAM-5495:
--

[~mxm] no no, feel free to grab the code directly

> PipelineResources algorithm is not working in most environments
> ---
>
> Key: BEAM-5495
> URL: https://issues.apache.org/jira/browse/BEAM-5495
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, runner-spark, sdk-java-core
>Reporter: Romain Manni-Bucau
>Assignee: Amit Sela
>Priority: Major
>  Labels: triaged
>
> Issues are:
> 1. it assumes the classloader is a URLClassLoader (not always true, and 
> Java >= 9 breaks that as well for the app loader)
> 2. it uses loader.getURLs(), which leads to including the JRE itself in the 
> staged files
> It looks like this resource detection algorithm can't work and should be 
> replaced by an SPI rather than a built-in, non-extensible algorithm. Another 
> valid alternative is to just drop that "guess" logic and force the user to 
> set the staged files.





[jira] [Created] (BEAM-6656) org.apache.beam.runners.core.construction.PipelineResources#detectClassPathResourcesToStage broken on java > 8

2019-02-12 Thread Romain Manni-Bucau (JIRA)
Romain Manni-Bucau created BEAM-6656:


 Summary: 
org.apache.beam.runners.core.construction.PipelineResources#detectClassPathResourcesToStage
 broken on java > 8
 Key: BEAM-6656
 URL: https://issues.apache.org/jira/browse/BEAM-6656
 Project: Beam
  Issue Type: Improvement
  Components: runner-core
Reporter: Romain Manni-Bucau


As mentioned in a PR, 
org.apache.beam.runners.core.construction.PipelineResources#detectClassPathResourcesToStage
 does not work with recent JVMs since the app loader is not a URLClassLoader. 
You can use the solution I mentioned at that time; see 
https://github.com/apache/beam/pull/4514.
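
A minimal sketch of one possible fallback, which is not necessarily the solution proposed in the linked PR: keep the `URLClassLoader` path when it applies and otherwise read the `java.class.path` system property. `ClasspathDetector` is a hypothetical name:

```java
import java.io.File;
import java.net.URISyntaxException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical detector: lists classpath files without requiring a URLClassLoader. */
final class ClasspathDetector {
    private ClasspathDetector() {}

    static List<String> detect(ClassLoader loader) {
        if (loader instanceof URLClassLoader) {
            List<String> files = new ArrayList<>();
            for (URL url : ((URLClassLoader) loader).getURLs()) {
                if ("file".equals(url.getProtocol())) {
                    try {
                        files.add(new File(url.toURI()).getAbsolutePath());
                    } catch (URISyntaxException | IllegalArgumentException e) {
                        // Not a usable local file URL: skip it.
                    }
                }
            }
            return files;
        }
        // On Java >= 9 the application loader is not a URLClassLoader anymore.
        return fromClassPathProperty(System.getProperty("java.class.path", ""));
    }

    static List<String> fromClassPathProperty(String classPath) {
        List<String> files = new ArrayList<>();
        for (String entry : classPath.split(Pattern.quote(File.pathSeparator))) {
            if (!entry.isEmpty()) {
                files.add(new File(entry).getAbsolutePath());
            }
        }
        return files;
    }

    public static void main(String[] args) {
        detect(ClasspathDetector.class.getClassLoader()).forEach(System.out::println);
    }
}
```

The system property only describes the application classpath, so containers with custom classloaders would still need an extension point or explicitly configured staged files.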





[jira] [Commented] (BEAM-6519) java extension must not use org.apache.beam.sdk.util

2019-01-31 Thread Romain Manni-Bucau (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16758005#comment-16758005
 ] 

Romain Manni-Bucau commented on BEAM-6519:
--

Well, the Beam project setup doesn't respect very common Java modularity 
basics, and the Gradle setup with vendoring made it even worse. Also, there is 
no real test except for flat classpath launches; this is why it occurred here 
and didn't in most other ASF projects. Fix that and no automation will be 
needed. The build is already very slow, so if you add a global project check 
like that it will be even worse for contributors, who already need to go 
through a huge step to fix a typo in a javadoc. And if it is not set in the 
default build, it will be missed. This is why I think it is better to respect 
the minimal rule of modularity in Java rather than look for an automation that 
adds to the customizations and specificities Beam has in its build for no dev 
gain.

> java extension must not use org.apache.beam.sdk.util
> 
>
> Key: BEAM-6519
> URL: https://issues.apache.org/jira/browse/BEAM-6519
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Affects Versions: 2.9.0
>Reporter: Romain Manni-Bucau
>Assignee: Romain Manni-Bucau
>Priority: Blocker
> Fix For: 2.11.0
>
>
> Some shades of Beam reuse SDK Java core packages (the util one in particular).
> This prevents using Beam in several environments (OSGi, classloader 
> isolation, Java 11/JPMS, etc.); please fix it ASAP.





[jira] [Commented] (BEAM-6519) java extension must not use org.apache.beam.sdk.util

2019-01-28 Thread Romain Manni-Bucau (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754622#comment-16754622
 ] 

Romain Manni-Bucau commented on BEAM-6519:
--

Well, enforcing it doesn't require any tool when you think about it. The most 
used practice is one package per module, and then tools are useless since 
people will add classes to the existing root module package.


About the blocking side: as mentioned, it prevents using Beam in multiple 
environments, and it is a regression due to the shading strategy, which is 
recent.

I have no issue if you do 2.10 without a fix, but for the user it is a 
blocker anyway ;).

> java extension must not use org.apache.beam.sdk.util
> 
>
> Key: BEAM-6519
> URL: https://issues.apache.org/jira/browse/BEAM-6519
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Affects Versions: 2.9.0
>Reporter: Romain Manni-Bucau
>Assignee: Romain Manni-Bucau
>Priority: Blocker
> Fix For: 2.10.0
>
>
> Some shades of Beam reuse SDK Java core packages (the util one in particular).
> This prevents using Beam in several environments (OSGi, classloader 
> isolation, Java 11/JPMS, etc.); please fix it ASAP.





[jira] [Commented] (BEAM-6519) java extension must not use org.apache.beam.sdk.util

2019-01-28 Thread Romain Manni-Bucau (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754251#comment-16754251
 ] 

Romain Manni-Bucau commented on BEAM-6519:
--

[~kenn] yes, I hit that with the gcp extension vendor jar but I suspect it can 
happen for others as well.

> java extension must not use org.apache.beam.sdk.util
> 
>
> Key: BEAM-6519
> URL: https://issues.apache.org/jira/browse/BEAM-6519
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Affects Versions: 2.9.0
>Reporter: Romain Manni-Bucau
>Assignee: Romain Manni-Bucau
>Priority: Blocker
> Fix For: 2.10.0
>
>
> Some shades of Beam reuse SDK Java core packages (the util one in particular).
> This prevents using Beam in several environments (OSGi, classloader 
> isolation, Java 11/JPMS, etc.); please fix it ASAP.





[jira] [Created] (BEAM-6519) java extension must not use org.apache.beam.sdk.util

2019-01-28 Thread Romain Manni-Bucau (JIRA)
Romain Manni-Bucau created BEAM-6519:


 Summary: java extension must not use org.apache.beam.sdk.util
 Key: BEAM-6519
 URL: https://issues.apache.org/jira/browse/BEAM-6519
 Project: Beam
  Issue Type: Bug
  Components: build-system
Affects Versions: 2.9.0
Reporter: Romain Manni-Bucau
Assignee: Luke Cwik


Some shades of Beam reuse SDK Java core packages (the util one in particular).
This prevents using Beam in several environments (OSGi, classloader 
isolation, Java 11/JPMS, etc.); please fix it ASAP.
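
The underlying problem is a split package: the same Java package is provided by more than one artifact, which JPMS rejects between named modules and which OSGi and isolated classloaders handle poorly. A minimal sketch of a detector, where `SplitPackages` and the module and class names in `main` are hypothetical:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical check: reports packages provided by more than one module or jar. */
final class SplitPackages {
    private SplitPackages() {}

    /**
     * @param classesByModule module name mapped to fully qualified class names
     * @return package name mapped to the modules providing it, for split packages only
     */
    static Map<String, Set<String>> find(Map<String, List<String>> classesByModule) {
        Map<String, Set<String>> modulesByPackage = new TreeMap<>();
        for (Map.Entry<String, List<String>> module : classesByModule.entrySet()) {
            for (String className : module.getValue()) {
                int lastDot = className.lastIndexOf('.');
                String pkg = lastDot < 0 ? "" : className.substring(0, lastDot);
                modulesByPackage.computeIfAbsent(pkg, k -> new TreeSet<>()).add(module.getKey());
            }
        }
        // A package owned by a single module is fine; only keep the shared ones.
        modulesByPackage.values().removeIf(modules -> modules.size() < 2);
        return modulesByPackage;
    }

    public static void main(String[] args) {
        Map<String, List<String>> modules = new TreeMap<>();
        // Illustrative class names only, not actual Beam classes.
        modules.put("sdk-core", Arrays.asList("org.apache.beam.sdk.util.CoreHelper"));
        modules.put("some-extension", Arrays.asList(
            "org.apache.beam.sdk.util.ExtensionHelper",
            "org.apache.beam.sdk.extensions.some.Feature"));
        System.out.println(find(modules));
    }
}
```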





[jira] [Commented] (BEAM-6480) Provide an avro sink for IndexedRecord without a formatter

2019-01-23 Thread Romain Manni-Bucau (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16749829#comment-16749829
 ] 

Romain Manni-Bucau commented on BEAM-6480:
--

[~iemejia] would work for me

> Provide an avro sink for IndexedRecord without a formatter
> --
>
> Key: BEAM-6480
> URL: https://issues.apache.org/jira/browse/BEAM-6480
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Affects Versions: 2.9.0
>Reporter: Romain Manni-Bucau
>Priority: Major
>
> More generally, a sink does not need a mapper API, since the preceding 
> PTransform can always map to a format the sink supports, so any sink can 
> assume the format is right.
> This could imply deprecating org.apache.beam.sdk.io.AvroIO.RecordFormatter.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-6481) Relocations don't include sources

2019-01-22 Thread Romain Manni-Bucau (JIRA)
Romain Manni-Bucau created BEAM-6481:


 Summary: Relocations don't include sources
 Key: BEAM-6481
 URL: https://issues.apache.org/jira/browse/BEAM-6481
 Project: Beam
  Issue Type: Bug
  Components: build-system
Reporter: Romain Manni-Bucau
Assignee: Luke Cwik


Beam uses a lot of relocations to shade/shadow libraries and try to avoid 
conflicts; however, the setup is not complete and the -sources.jar files don't 
include the sources of the shaded code, making it less friendly to debug.

Can you please fix that?

Thanks,
Romain





[jira] [Created] (BEAM-6480) Provide an avro sink for IndexedRecord without a formatter

2019-01-22 Thread Romain Manni-Bucau (JIRA)
Romain Manni-Bucau created BEAM-6480:


 Summary: Provide an avro sink for IndexedRecord without a formatter
 Key: BEAM-6480
 URL: https://issues.apache.org/jira/browse/BEAM-6480
 Project: Beam
  Issue Type: Task
  Components: sdk-java-core
Affects Versions: 2.9.0
Reporter: Romain Manni-Bucau
Assignee: Kenneth Knowles


More generally, a sink does not need a mapper API, since the preceding 
PTransform can always map to a format the sink supports, so any sink can 
assume the format is right.

This could imply deprecating org.apache.beam.sdk.io.AvroIO.RecordFormatter.





[jira] [Created] (BEAM-6479) org.apache.beam.sdk.io.AvroIO.RecordFormatter should use IndexedRecord and not GenericRecord

2019-01-22 Thread Romain Manni-Bucau (JIRA)
Romain Manni-Bucau created BEAM-6479:


 Summary: org.apache.beam.sdk.io.AvroIO.RecordFormatter should use 
IndexedRecord and not GenericRecord
 Key: BEAM-6479
 URL: https://issues.apache.org/jira/browse/BEAM-6479
 Project: Beam
  Issue Type: Task
  Components: sdk-java-core
Affects Versions: 2.9.0
Reporter: Romain Manni-Bucau
Assignee: Kenneth Knowles


Since GenericRecord is an IndexedRecord and the schema is provided, this can 
only widen the supported use cases.
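
A minimal sketch of why typing the formatter on the supertype widens the accepted implementations. The two record interfaces below are simplified stand-ins that only mirror the Avro hierarchy (in Avro, GenericRecord extends IndexedRecord), and `IndexedFormatter` is a hypothetical contract, not the actual AvroIO.RecordFormatter signature:

```java
import java.util.Arrays;
import java.util.List;

/** Simplified stand-in for Avro's IndexedRecord: positional access only. */
interface IndexedRecord {
    Object get(int index);
}

/** Simplified stand-in for Avro's GenericRecord, which extends IndexedRecord. */
interface GenericRecord extends IndexedRecord {
    Object get(String field);
}

/** Hypothetical formatter contract typed on the wider IndexedRecord. */
interface IndexedFormatter<T> {
    IndexedRecord format(T element);
}

/** A record that is only positional, like a generated (specific) record. */
final class PositionalRecord implements IndexedRecord {
    private final Object[] values;

    PositionalRecord(Object... values) {
        this.values = values.clone();
    }

    @Override
    public Object get(int index) {
        return values[index];
    }
}

/** A record that also supports access by name. */
final class NamedRecord implements GenericRecord {
    private final List<String> names;
    private final Object[] values;

    NamedRecord(List<String> names, Object... values) {
        this.names = names;
        this.values = values.clone();
    }

    @Override
    public Object get(int index) {
        return values[index];
    }

    @Override
    public Object get(String field) {
        return values[names.indexOf(field)];
    }
}

final class FormatterDemo {
    private FormatterDemo() {}

    static <T> Object firstField(IndexedFormatter<T> formatter, T element) {
        return formatter.format(element).get(0);
    }

    public static void main(String[] args) {
        // Existing formatters producing GenericRecord still fit the wider contract...
        IndexedFormatter<String> generic = s -> new NamedRecord(Arrays.asList("name"), s);
        // ...and purely positional records become possible as well.
        IndexedFormatter<String> positional = s -> new PositionalRecord(s);
        System.out.println(firstField(generic, "beam") + " " + firstField(positional, "beam"));
    }
}
```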


