[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581391#comment-16581391
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1157
  
Closing as merged


> Fix job status liveness bug and parallelize finalizer file writing
> --
>
> Key: METRON-1732
> URL: https://issues.apache.org/jira/browse/METRON-1732
> Project: Metron
>  Issue Type: Sub-task
>Reporter: Michael Miklavcic
>Assignee: Michael Miklavcic
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581392#comment-16581392
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user mmiklavc closed the pull request at:

https://github.com/apache/metron/pull/1157




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581344#comment-16581344
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1157
  
Awesome! Thanks for the review and smoke test @nickwallen and @merrimanr. I 
am going to go ahead and merge this.




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581321#comment-16581321
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/1157
  
+1 I ran this up successfully.  Validated...
- [x] Alerts visible in the UI
- [x] Metron Service Check
- [x] Capture PCAP in HDFS 
- [x] Read PCAP from HDFS using CLI. 
- [x] Able to open resulting pcap file with `tshark -r `.
- [x] Read PCAP from HDFS using PCAP UI
- [x] Download PCAP from UI and open in Wireshark GUI.

![screen shot 2018-08-15 at 12 29 47 
pm](https://user-images.githubusercontent.com/2475409/44160249-d5a21b00-a087-11e8-94b7-b5fd1daec8d9.png)





[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581314#comment-16581314
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1157
  
Please disregard.  I failed to deploy the Ambari changes correctly.




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581308#comment-16581308
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1157
  
I ran a quick test in REST and it looks like the status never gets to 
`SUCCEEDED`.

Here is my request:
```
curl -X POST --header 'Content-Type: application/json' --header 'Accept: 
application/json' -d '{}' 'http://node1:8082/api/v1/pcap/fixed'
```

After the job finishes (looking at the RM UI), the status is:
```
{
  "jobId": "job_1533831319048_0046",
  "jobStatus": "FINALIZING",
  "description": "Finalizing job.",
  "percentComplete": 75,
  "pageTotal": 0
}
```
If I keep requesting status it always returns this.
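
A bounded polling loop makes this kind of stuck status easy to spot. The sketch below is only illustrative: `StatusPoller` is a hypothetical helper, the status source is an abstract `Supplier` rather than a real REST call, and the terminal states other than `SUCCEEDED` are assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

class StatusPoller {
  // Assumed terminal states; only SUCCEEDED and FINALIZING appear in this thread.
  private static final Set<String> TERMINAL =
      new HashSet<>(Arrays.asList("SUCCEEDED", "FAILED", "KILLED"));

  /**
   * Polls statusSource up to maxPolls times and returns the last status seen.
   * A non-terminal return value means the job did not finish within the budget.
   */
  static String pollUntilDone(Supplier<String> statusSource, int maxPolls, long intervalMillis) {
    String last = null;
    for (int i = 0; i < maxPolls; i++) {
      last = statusSource.get();
      if (TERMINAL.contains(last)) {
        return last;
      }
      try {
        Thread.sleep(intervalMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return last;
      }
    }
    return last;
  }

  public static void main(String[] args) {
    // A source that never leaves FINALIZING, mimicking the behavior reported above.
    System.out.println(pollUntilDone(() -> "FINALIZING", 3, 10L));
  }
}
```

If the returned status is still `FINALIZING` after a generous number of polls, the job is probably stuck rather than slow.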




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580247#comment-16580247
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1157
  
Note - I also added a small blurb about pcap page size to the README.md 
alongside the notes on setting the finalizer threads. This was missed 
previously.




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580241#comment-16580241
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1157#discussion_r210059160
  
--- Diff: 
metron-platform/metron-pcap/src/main/java/org/apache/metron/pcap/mr/PcapJob.java
 ---
@@ -307,8 +307,11 @@ public void setCompleteCheckInterval(long interval) {
   }
   return this;
 }
-mrJob.submit();
-jobStatus.withState(State.SUBMITTED).withDescription("Job submitted").withJobId(mrJob.getJobID().toString());
+synchronized (this) {
--- End diff --

fyi, turns out I was right first time around. Synchronization is necessary 
for visibility in the timer thread that is started after these modifications. 
I've updated the comments in code to describe this.
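
The visibility argument can be sketched as follows. This is a minimal illustration and not the actual `PcapJob` code: `StatusHolder`, its fields, and `submitAndObserve` are hypothetical names. The point is only that a write made inside a `synchronized` block is guaranteed to be visible to another thread that later synchronizes on the same monitor.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a job whose status is read by a timer thread.
class StatusHolder {
  private String state = "NOT_RUNNING"; // guarded by "this"
  private String jobId;                 // guarded by "this"

  // Writer: publish both fields under the same monitor the reader uses.
  void markSubmitted(String id) {
    synchronized (this) {
      state = "SUBMITTED";
      jobId = id;
    }
  }

  // Reader: locking the same monitor guarantees it sees the writes above.
  synchronized String describe() {
    return state + ":" + jobId;
  }

  // Updates the status, then reads it back from a java.util.Timer thread.
  static String submitAndObserve(String id) {
    StatusHolder holder = new StatusHolder();
    holder.markSubmitted(id);
    String[] seen = new String[1];
    CountDownLatch done = new CountDownLatch(1);
    Timer timer = new Timer(true);
    timer.schedule(new TimerTask() {
      @Override
      public void run() {
        seen[0] = holder.describe();
        done.countDown();
      }
    }, 0L);
    try {
      return done.await(5, TimeUnit.SECONDS) ? seen[0] : null;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    } finally {
      timer.cancel();
    }
  }

  public static void main(String[] args) {
    System.out.println(submitAndObserve("job_0001"));
  }
}
```

Without the shared lock (or another happens-before edge such as a `volatile` field), the reading thread would not be guaranteed to observe the updated state.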




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580238#comment-16580238
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1157
  
**Testing**

Test plan pulled from here - 
https://github.com/apache/metron/pull/1081#issuecomment-400556832

Get PCAP data into Metron: 
1. Install and setup pycapa (this has been updated in master recently) - 
https://github.com/apache/metron/blob/master/metron-sensors/pycapa/README.md#centos-6
2. (if using singlenode vagrant) Kill the enrichment, profiler, indexing, 
and sensor topologies via `for i in bro enrichment random_access_indexing 
batch_indexing yaf snort;do storm kill $i;done`
3. Start the pcap topology via $METRON_HOME/bin/start_pcap_topology.sh
4. Start the pycapa packet capture producer on eth1 via /usr/bin/pycapa 
--producer --topic pcap -i eth1 -k node1:6667
5. Watch the topology in the Storm UI and kill the packet capture utility 
from before, when the number of packets ingested is over 3k.
6. Ensure that at least 3 files exist on HDFS by running hadoop fs -ls 
/apps/metron/pcap
7. Choose a file (denoted by $FILE) and dump a few of the contents using 
the pcap_inspector utility via $METRON_HOME/bin/pcap_inspector.sh -i $FILE -n 5
8. Choose one of the lines and note the protocol.
9. Note that when you run the commands below, the resulting file will be 
placed in the execution directory where you kicked off the job from.

### Fixed filter

1. Run a fixed filter query by executing the following command with the 
values noted above (match your start_time format to the date format provided - 
default is to use millis since epoch)
2. `$METRON_HOME/bin/pcap_query.sh fixed -st  -df "yyyyMMdd" -p 
 -rpf 500`
3. Verify the MR job finishes successfully. Upon completion, you should see 
multiple files named with relatively current datestamps in your current 
directory, e.g. pcap-data-20160617160549737+.pcap
4. Copy the files to your local machine and verify you can open them in 
Wireshark. I chose a middle file and the last file. The middle file should have 
500 records (per the records_per_file option), and the last one will likely 
have a number of records <= 500.

### Query filter

1. Run a Stellar query filter query by executing a command similar to the 
following, with the values noted above (match your start_time format to the 
date format provided - default is to use millis since epoch)
2. `$METRON_HOME/bin/pcap_query.sh query -st "20160617" -df "yyyyMMdd" 
-query "protocol == '6'"  -rpf 500`
3. Verify the MR job finishes successfully. Upon completion, you should see 
multiple files named with relatively current datestamps in your current 
directory, e.g. pcap-data-20160617160549737+.pcap
4. Copy the files to your local machine and verify you can open them in 
Wireshark. I chose a middle file and the last file. The middle file should have 
500 records (per the records_per_file option), and the last one will likely 
have a number of records <= 500.

Also run riffs on the fixed query via the Metron Alerts UI PCAP query panel.




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578642#comment-16578642
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1157#discussion_r209689930
  
--- Diff: 
metron-platform/metron-pcap/src/main/java/org/apache/metron/pcap/mr/PcapJob.java
 ---
@@ -307,8 +307,11 @@ public void setCompleteCheckInterval(long interval) {
   }
   return this;
 }
-mrJob.submit();
-jobStatus.withState(State.SUBMITTED).withDescription("Job submitted").withJobId(mrJob.getJobID().toString());
+synchronized (this) {
--- End diff --

Will do. This lock is about thread visibility as opposed to actual issues 
with concurrent modification. It may be that this lock is not needed with 
getStatus being synchronized. I will double check and report back via modified 
code and/or code comment on this.




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578627#comment-16578627
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1157
  
Good feedback @nickwallen, I'll make adjustments.




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578626#comment-16578626
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1157#discussion_r209687780
  
--- Diff: 
metron-platform/metron-pcap/src/main/java/org/apache/metron/pcap/finalizer/PcapFinalizer.java
 ---
@@ -99,10 +104,55 @@ protected PcapResultsWriter getResultsWriter() {
 LOG.warn("Unable to cleanup files in HDFS", e);
   }
 }
+LOG.info("Done finalizing results");
 return new PcapPages(outFiles);
   }
 
-  protected abstract void write(PcapResultsWriter resultsWriter, Configuration hadoopConfig, List data, Path outputPath) throws IOException;
+  /**
+   * Figure out how many threads to use in the thread pool. If it's a string and ends with "C",
+   * then strip the C and treat it as an integral multiple of the number of cores.  If it's a
+   * string and does not end with a C, then treat it as a number in string form.
+   */
+  private static int getNumThreads(String numThreads) {
+String numThreadsStr = ((String) numThreads).trim().toUpperCase();
+if (numThreadsStr.endsWith("C")) {
+  Integer factor = Integer.parseInt(numThreadsStr.replace("C", ""));
+  return factor * Runtime.getRuntime().availableProcessors();
+} else {
+  return Integer.parseInt(numThreadsStr);
+}
+  }
+
+  protected List writeParallel(Configuration hadoopConfig, Map> toWrite,
+  int parallelism) throws IOException {
+List outFiles = Collections.synchronizedList(new ArrayList<>());
+ForkJoinPool tp = new ForkJoinPool(parallelism);
+try {
+  tp.submit(() -> {
+toWrite.entrySet().parallelStream().forEach(e -> {
--- End diff --

As I understand it, submit is effectively submitting the set of tasks for 
the parallel stream to execute within this threadpool, e.g. 
https://www.baeldung.com/java-8-parallel-streams-custom-threadpool. As a side 
note, the reason for a custom threadpool at all is so that this doesn't cause 
issues with other streams since the default in Java is to use a global context 
for this sort of thing. Liveness issues may arise when using the shared global 
context.
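
The pattern described here can be sketched in isolation. The class and method names below are hypothetical, and the technique relies on stream-implementation behavior that is widely used but not spelled out in the Javadoc: a parallel stream started from inside a `ForkJoinPool` task runs its subtasks in that pool rather than in the common pool.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class CustomPoolStream {
  // Runs a parallel stream inside a dedicated pool and returns the worker thread names used.
  static Set<String> workerNames(int parallelism, int items) {
    ForkJoinPool pool = new ForkJoinPool(parallelism);
    Set<String> names = ConcurrentHashMap.newKeySet();
    try {
      List<Integer> work = IntStream.range(0, items).boxed().collect(Collectors.toList());
      // A single submit() call; the stream itself splits the work across the pool's workers.
      pool.submit(() -> {
        work.parallelStream().forEach(i -> names.add(Thread.currentThread().getName()));
      }).get(); // block until every element has been processed
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for the pool", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Parallel task failed", e.getCause());
    } finally {
      pool.shutdown();
    }
    return names;
  }

  public static void main(String[] args) {
    System.out.println(workerNames(4, 1000));
  }
}
```

The printed names should belong to the dedicated pool, not to `ForkJoinPool.commonPool`, which is what keeps this work from competing with other parallel streams in the same JVM.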




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578569#comment-16578569
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/1157#discussion_r209649851
  
--- Diff: metron-interface/metron-rest/README.md ---
@@ -223,6 +223,9 @@ REST will supply the script with raw pcap data through standard in and expects P
 
 Pcap query jobs can be configured for submission to a YARN queue.  This setting is exposed as the Spring property `pcap.yarn.queue`.  If configured, the REST application will set the `mapreduce.job.queuename` Hadoop property to that value.
 
+Pcap query jobs have a finalization routine that writes their results out to HDFS in pages. There is a threadpool used for this finalization that can be configured to use a specified number of threads.
+This setting is exposed as the Spring property `pcap.finalizer.threadpool.size`
--- End diff --

Can we document the default value for this?




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578570#comment-16578570
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/1157#discussion_r209651293
  
--- Diff: metron-interface/metron-rest/README.md ---
@@ -223,6 +223,9 @@ REST will supply the script with raw pcap data through standard in and expects P
 
 Pcap query jobs can be configured for submission to a YARN queue.  This setting is exposed as the Spring property `pcap.yarn.queue`.  If configured, the REST application will set the `mapreduce.job.queuename` Hadoop property to that value.
 
+Pcap query jobs have a finalization routine that writes their results out to HDFS in pages. There is a threadpool used for this finalization that can be configured to use a specified number of threads.
+This setting is exposed as the Spring property `pcap.finalizer.threadpool.size`
--- End diff --

Should we mention that 1C, 4C are valid values in addition to integers?  
Perhaps just copy the text you have in the Ambari description into the README.  
Good stuff.




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578574#comment-16578574
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/1157#discussion_r209665613
  
--- Diff: 
metron-platform/metron-pcap/src/main/java/org/apache/metron/pcap/mr/PcapJob.java
 ---
@@ -307,8 +307,11 @@ public void setCompleteCheckInterval(long interval) {
   }
   return this;
 }
-mrJob.submit();
-jobStatus.withState(State.SUBMITTED).withDescription("Job submitted").withJobId(mrJob.getJobID().toString());
+synchronized (this) {
--- End diff --

Can we add a comment about why we need the lock here?




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578571#comment-16578571
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/1157#discussion_r209655011
  
--- Diff: 
metron-platform/metron-pcap/src/main/java/org/apache/metron/pcap/finalizer/PcapFinalizer.java
 ---
@@ -99,10 +104,55 @@ protected PcapResultsWriter getResultsWriter() {
 LOG.warn("Unable to cleanup files in HDFS", e);
   }
 }
+LOG.info("Done finalizing results");
 return new PcapPages(outFiles);
   }
 
-  protected abstract void write(PcapResultsWriter resultsWriter, Configuration hadoopConfig, List data, Path outputPath) throws IOException;
+  /**
+   * Figure out how many threads to use in the thread pool. If it's a string and ends with "C",
+   * then strip the C and treat it as an integral multiple of the number of cores.  If it's a
+   * string and does not end with a C, then treat it as a number in string form.
+   */
+  private static int getNumThreads(String numThreads) {
+String numThreadsStr = ((String) numThreads).trim().toUpperCase();
+if (numThreadsStr.endsWith("C")) {
+  Integer factor = Integer.parseInt(numThreadsStr.replace("C", ""));
--- End diff --

Should we add a catch block for when a user enters an invalid value?  We 
should catch and provide a helpful exception message like "Invalid value for 
property 'pcap.finalizer.threadpool.size'; value='3CCC'".
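
One way the suggested validation could look is sketched below. It is a standalone illustration, not the merged code: `FinalizerThreads` and `parse` are hypothetical names, and the core count is passed in as a parameter so the behavior is deterministic. Note that `replace("C", "")` in the diff would turn "3CCC" into "3" and accept it silently, so this sketch strips only the final character.

```java
import java.util.Locale;

class FinalizerThreads {
  static final String PROPERTY = "pcap.finalizer.threadpool.size";

  // Parses "N" as N threads and "NC" as N times the given number of cores.
  static int parse(String raw, int cores) {
    String value = raw == null ? "" : raw.trim().toUpperCase(Locale.ROOT);
    try {
      int threads;
      if (value.endsWith("C")) {
        // Strip only the trailing C so that "3CCC" is rejected rather than read as 3.
        int factor = Integer.parseInt(value.substring(0, value.length() - 1));
        threads = Math.multiplyExact(factor, cores);
      } else {
        threads = Integer.parseInt(value);
      }
      if (threads < 1) {
        throw new NumberFormatException("thread count must be positive");
      }
      return threads;
    } catch (NumberFormatException | ArithmeticException e) {
      throw new IllegalArgumentException(
          "Invalid value for property '" + PROPERTY + "'; value='" + raw + "'", e);
    }
  }

  public static void main(String[] args) {
    int cores = Runtime.getRuntime().availableProcessors();
    System.out.println(parse("2C", cores) + " threads for 2C on " + cores + " cores");
  }
}
```

With this shape, `parse("4", 8)` yields 4, `parse("2c", 8)` yields 16, and `parse("3CCC", 8)` fails with the message format proposed in the review comment.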




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578572#comment-16578572
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/1157#discussion_r209674410
  
--- Diff: 
metron-platform/metron-pcap/src/main/java/org/apache/metron/pcap/finalizer/PcapFinalizer.java
 ---
@@ -99,10 +104,55 @@ protected PcapResultsWriter getResultsWriter() {
 LOG.warn("Unable to cleanup files in HDFS", e);
   }
 }
+LOG.info("Done finalizing results");
 return new PcapPages(outFiles);
   }
 
-  protected abstract void write(PcapResultsWriter resultsWriter, Configuration hadoopConfig, List data, Path outputPath) throws IOException;
+  /**
+   * Figure out how many threads to use in the thread pool. If it's a string and ends with "C",
+   * then strip the C and treat it as an integral multiple of the number of cores.  If it's a
+   * string and does not end with a C, then treat it as a number in string form.
+   */
+  private static int getNumThreads(String numThreads) {
+String numThreadsStr = ((String) numThreads).trim().toUpperCase();
+if (numThreadsStr.endsWith("C")) {
+  Integer factor = Integer.parseInt(numThreadsStr.replace("C", ""));
+  return factor * Runtime.getRuntime().availableProcessors();
+} else {
+  return Integer.parseInt(numThreadsStr);
+}
+  }
+
+  protected List writeParallel(Configuration hadoopConfig, Map> toWrite,
+  int parallelism) throws IOException {
+List outFiles = Collections.synchronizedList(new ArrayList<>());
+ForkJoinPool tp = new ForkJoinPool(parallelism);
+try {
+  tp.submit(() -> {
+toWrite.entrySet().parallelStream().forEach(e -> {
--- End diff --

Shouldn't we be calling `tp.submit` for each (path, data)?  




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578573#comment-16578573
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/1157#discussion_r209671313
  
--- Diff: 
metron-platform/metron-pcap/src/main/java/org/apache/metron/pcap/finalizer/PcapFinalizer.java
 ---
@@ -99,10 +104,55 @@ protected PcapResultsWriter getResultsWriter() {
 LOG.warn("Unable to cleanup files in HDFS", e);
   }
 }
+LOG.info("Done finalizing results");
 return new PcapPages(outFiles);
   }
 
-  protected abstract void write(PcapResultsWriter resultsWriter, Configuration hadoopConfig, List data, Path outputPath) throws IOException;
+  /**
+   * Figure out how many threads to use in the thread pool. If it's a string and ends with "C",
+   * then strip the C and treat it as an integral multiple of the number of cores.  If it's a
+   * string and does not end with a C, then treat it as a number in string form.
+   */
+  private static int getNumThreads(String numThreads) {
+String numThreadsStr = ((String) numThreads).trim().toUpperCase();
+if (numThreadsStr.endsWith("C")) {
+  Integer factor = Integer.parseInt(numThreadsStr.replace("C", ""));
+  return factor * Runtime.getRuntime().availableProcessors();
+} else {
+  return Integer.parseInt(numThreadsStr);
+}
+  }
+
+  protected List writeParallel(Configuration hadoopConfig, Map> toWrite,
+  int parallelism) throws IOException {
+List outFiles = Collections.synchronizedList(new ArrayList<>());
+ForkJoinPool tp = new ForkJoinPool(parallelism);
+try {
+  tp.submit(() -> {
+toWrite.entrySet().parallelStream().forEach(e -> {
+  try {
+Path path = e.getKey();
+List data = e.getValue();
+if (data.size() > 0) {
+  write(getResultsWriter(), hadoopConfig, data, path);
+  outFiles.add(path);
+}
+  } catch (IOException ioe) {
+throw new RuntimeException("Failed to write results", ioe);
--- End diff --

Can we add the path that failed to write to the exception message?
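
A small sketch of what that could look like, using a hypothetical `PageWriter` interface and plain string paths in place of the finalizer's real writer and Hadoop `Path` type:

```java
import java.io.IOException;
import java.util.Collections;
import java.util.List;

class WriteWithContext {
  // Hypothetical writer interface standing in for the finalizer's write(...) call.
  interface PageWriter {
    void write(String path, List<byte[]> data) throws IOException;
  }

  static void writeOrFail(PageWriter writer, String path, List<byte[]> data) {
    try {
      writer.write(path, data);
    } catch (IOException ioe) {
      // Name the failing path and keep the original exception as the cause.
      throw new RuntimeException(
          String.format("Failed to write results to path '%s'", path), ioe);
    }
  }

  public static void main(String[] args) {
    PageWriter failing = (p, d) -> { throw new IOException("disk full"); };
    try {
      writeOrFail(failing, "/tmp/page-1.pcap", Collections.emptyList());
    } catch (RuntimeException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Because pages are written in parallel, having the path in the message is what lets an operator tell which page failed.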




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578568#comment-16578568
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/1157#discussion_r209650724
  
--- Diff: metron-interface/metron-rest/README.md ---
@@ -223,6 +223,9 @@ REST will supply the script with raw pcap data through standard in and expects P
 
 Pcap query jobs can be configured for submission to a YARN queue.  This setting is exposed as the Spring property `pcap.yarn.queue`.  If configured, the REST application will set the `mapreduce.job.queuename` Hadoop property to that value.
 
+Pcap query jobs have a finalization routine that writes their results out to HDFS in pages. There is a threadpool used for this finalization that can be configured to use a specified number of threads.
+This setting is exposed as the Spring property `pcap.finalizer.threadpool.size`
--- End diff --

Do you have any advice on when a user should increase/decrease this value?  
Are there errors I might see that would be resolved by increasing/decreasing 
this value?

If you don't have a good understanding of this, then we don't need to worry 
about it.




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578426#comment-16578426
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/1157#discussion_r209649720
  
--- Diff: metron-interface/metron-rest/README.md ---
@@ -223,6 +223,9 @@ REST will supply the script with raw pcap data through standard in and expects P
 
 Pcap query jobs can be configured for submission to a YARN queue.  This setting is exposed as the Spring property `pcap.yarn.queue`.  If configured, the REST application will set the `mapreduce.job.queuename` Hadoop property to that value.
 
+Pcap query jobs have a finalization routine that writes their results out to HDFS in pages. There is a threadpool used for this finalization that can be configured to use a specified number of threads.
+This setting is exposed as the Spring property `pcap.finalizer.threadpool.size`
--- End diff --

Can we document the default value for this?




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577320#comment-16577320
 ] 

ASF GitHub Bot commented on METRON-1732:


Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1157
  
This PR also updates the status reporting so that the finalizer accounts for 
25% of the reported progress. In local testing, a query via the Alerts PCAP UI 
with a small page size (10 results per page), which produced 7,299 pages, took 
15 minutes with parallelism set to 1. With parallelism set to 8, it went down 
to 2-3 minutes.
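
The idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the class `ParallelFinalizerSketch`, the methods `writePages` and `overallPercent`, and the `pageWriter` callback are hypothetical names, the callback merely stands in for the real HDFS page write, and the 75%/25% split between the query job and the finalizer is an assumption based on the comment above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntConsumer;

public class ParallelFinalizerSketch {

  /** Share of overall progress attributed to finalization (assumed: 25%). */
  static final double FINALIZER_WEIGHT = 0.25;

  /**
   * Combines job and finalizer progress (each in the range 0.0 to 1.0) into
   * one percentage. Assumes the remaining 75% belongs to the query job itself.
   */
  static double overallPercent(double jobFraction, double finalizerFraction) {
    return 100.0 * ((1.0 - FINALIZER_WEIGHT) * jobFraction
        + FINALIZER_WEIGHT * finalizerFraction);
  }

  /**
   * Writes pages 0..pageCount-1 on a fixed-size thread pool and returns the
   * number of pages written. The pageWriter is a hypothetical stand-in for
   * the real write of one results page.
   */
  static int writePages(int pageCount, int threads, IntConsumer pageWriter)
      throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    AtomicInteger written = new AtomicInteger();
    try {
      List<Callable<Void>> tasks = new ArrayList<>();
      for (int i = 0; i < pageCount; i++) {
        final int page = i;
        tasks.add(() -> {
          pageWriter.accept(page);
          written.incrementAndGet();
          return null;
        });
      }
      // invokeAll blocks until every submitted task has completed.
      for (Future<Void> result : pool.invokeAll(tasks)) {
        result.get(); // rethrows any failure from a page write
      }
    } finally {
      pool.shutdown();
    }
    return written.get();
  }

  public static void main(String[] args) throws Exception {
    int pages = writePages(100, 8, page -> { /* write one page here */ });
    System.out.println(pages + " pages written, overall progress "
        + overallPercent(1.0, 1.0) + "%");
  }
}
```

With `threads` set to 1 the pages are written one after another; a larger pool lets independent page writes overlap, which is the effect the timings above describe.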




[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing

2018-08-09 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575759#comment-16575759
 ] 

ASF GitHub Bot commented on METRON-1732:


GitHub user mmiklavc opened a pull request:

https://github.com/apache/metron/pull/1157

METRON-1732: Fix job status liveness bug and parallelize finalizer file 
writing

## Contributor Comments

This still needs to have the # of finalizer threads option exposed for the 
REST application, but since it's multi-threaded code I wanted to get the review 
process started while I finish that part up.

Test plan and more detailed description to follow.

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [ ] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [ ] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [ ] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [ ] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && 
dev-utilities/build-utils/verify_licenses.sh 
  ```

- [ ] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not then run 
the following commands and the verify changes via 
`site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

 Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up 
for your personal repository such that your branches are built there before 
submitting a pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mmiklavc/metron parallel-hdfs-write

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/1157.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1157


commit 0b887fb64f9e4f2682e454e69b42b0b1014f3f4d
Author: Michael Miklavcic 
Date:   2018-08-10T05:11:59Z

Parallelize finalizer writing



