[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238033346
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/discoveryengine/WeblogDiscoveryEngineTest.java
 ##
 @@ -0,0 +1,127 @@
+package org.apache.sdap.mudrod.discoveryengine;
+
+import static org.junit.Assert.*;
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.FileNotFoundException;
+import java.io.FileReader;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Properties;
+import org.apache.sdap.mudrod.driver.ESDriver;
+import org.apache.sdap.mudrod.driver.SparkDriver;
+import org.apache.sdap.mudrod.main.AbstractElasticsearchIntegrationTest;
+import org.apache.sdap.mudrod.main.MudrodConstants;
+import org.apache.sdap.mudrod.main.MudrodEngine;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class WeblogDiscoveryEngineTest extends AbstractElasticsearchIntegrationTest {
+
+  private static WeblogDiscoveryEngine weblogEngine = null;
+
+  @BeforeClass
+  public static void setUp() {
+    MudrodEngine mudrodEngine = new MudrodEngine();
+    Properties props = mudrodEngine.loadConfig();
+    ESDriver es = new ESDriver(props);
+    SparkDriver spark = new SparkDriver(props);
+    String dataDir = getTestDataPath();
+    System.out.println(dataDir);
+    props.setProperty(MudrodConstants.DATA_DIR, dataDir);
+    MudrodEngine.loadPathConfig(mudrodEngine, dataDir);
+    weblogEngine = new WeblogDiscoveryEngine(props, es, spark);
+  }
+
+  @AfterClass
+  public static void tearDown() {
+    // TODO
+  }
+
+  private static String getTestDataPath() {
+    File resourcesDirectory = new File("src/test/resources/");
+    String resourcedir = "/Testing_Data_1_3dayLog+Meta+Onto/";
+    String dataDir = resourcesDirectory.getAbsolutePath() + resourcedir;
+    return dataDir;
+  }
+
+  @Test
+  public void testPreprocess() throws IOException {
+
+    weblogEngine.preprocess();
+    testPreprocess_userHistory();
+    testPreprocess_clickStream();
+  }
+
+  private void testPreprocess_userHistory() throws IOException {
 
 Review comment:
   Also, you should always document your tests.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238033834
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/driver/PluginConfigurableNode.java
 ##
 @@ -0,0 +1,14 @@
+package org.apache.sdap.mudrod.driver;
 
 Review comment:
   License header
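
For reference, the header being asked for is presumably the standard ASF source header, which Apache projects place at the top of each source file, above the package declaration. A sketch of how the start of this file would then look (the package line is taken from the diff above):

```java
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.sdap.mudrod.driver;
```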




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238033850
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/driver/PluginConfigurableNode.java
 ##
 @@ -0,0 +1,14 @@
+package org.apache.sdap.mudrod.driver;
 
 Review comment:
   What is this even doing?




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238033999
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/weblog/structure/RequestUrlTest.java
 ##
 @@ -0,0 +1,75 @@
+package org.apache.sdap.mudrod.weblog.structure;
 
 Review comment:
   License header. 




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238033357
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/discoveryengine/WeblogDiscoveryEngineTest.java
 ##
 @@ -0,0 +1,127 @@
+package org.apache.sdap.mudrod.discoveryengine;
+
+import static org.junit.Assert.*;
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.FileNotFoundException;
+import java.io.FileReader;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Properties;
+import org.apache.sdap.mudrod.driver.ESDriver;
+import org.apache.sdap.mudrod.driver.SparkDriver;
+import org.apache.sdap.mudrod.main.AbstractElasticsearchIntegrationTest;
+import org.apache.sdap.mudrod.main.MudrodConstants;
+import org.apache.sdap.mudrod.main.MudrodEngine;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class WeblogDiscoveryEngineTest extends AbstractElasticsearchIntegrationTest {
+
+  private static WeblogDiscoveryEngine weblogEngine = null;
+
+  @BeforeClass
+  public static void setUp() {
+    MudrodEngine mudrodEngine = new MudrodEngine();
+    Properties props = mudrodEngine.loadConfig();
+    ESDriver es = new ESDriver(props);
+    SparkDriver spark = new SparkDriver(props);
+    String dataDir = getTestDataPath();
+    System.out.println(dataDir);
+    props.setProperty(MudrodConstants.DATA_DIR, dataDir);
+    MudrodEngine.loadPathConfig(mudrodEngine, dataDir);
+    weblogEngine = new WeblogDiscoveryEngine(props, es, spark);
+  }
+
+  @AfterClass
+  public static void tearDown() {
+    // TODO
+  }
+
+  private static String getTestDataPath() {
+    File resourcesDirectory = new File("src/test/resources/");
+    String resourcedir = "/Testing_Data_1_3dayLog+Meta+Onto/";
+    String dataDir = resourcesDirectory.getAbsolutePath() + resourcedir;
+    return dataDir;
+  }
+
+  @Test
+  public void testPreprocess() throws IOException {
+
+    weblogEngine.preprocess();
+    testPreprocess_userHistory();
+    testPreprocess_clickStream();
+  }
+
+  private void testPreprocess_userHistory() throws IOException {
+    // compare user history data
+    String userHistorycsvFile = getTestDataPath() + "/userHistoryMatrix.csv";
+    BufferedReader br = new BufferedReader(new FileReader(userHistorycsvFile));
+    String line = null;
+    HashMap<String, List<String>> map = new HashMap<>();
+    int i = 0;
+    List<String> header = new LinkedList<>();
+    while ((line = br.readLine()) != null) {
+      if (i == 0) {
+        String str[] = line.split(",");
+        for (String s : str) {
+          header.add(s);
+        }
+      } else {
+        String str[] = line.split(",");
+        for (int j = 1; j < str.length; j++) {
+          if (!str[j].equals("0")) {
+            if (!map.containsKey(str[0])) {
+              map.put(str[0], new ArrayList<>());
+            }
+            map.get(str[0]).add(header.get(j));
+          }
+        }
+      }
+      i += 1;
+    }
+
+    Assert.assertEquals("failed in history data result!", "195.219.98.7", String.join(",", map.get("sea surface topography")));
+  }
+
+  private void testPreprocess_clickStream() throws IOException {
 
 Review comment:
   Document test.
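
One possible reading of this request, sketched under assumptions: the CSV-parsing logic from the helper above is moved into a small documented method whose Javadoc states what the file format is and what the result means. The class name `UserHistoryMatrix` and the sample data are hypothetical and not part of the pull request; the sketch also closes the reader with try-with-resources, which the code under review does not do.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the pull request. */
final class UserHistoryMatrix {

  private UserHistoryMatrix() {
  }

  /**
   * Parses a user history matrix in CSV form.
   *
   * <p>The first row is the header; its columns from index 1 onwards are
   * labels (they appear to be IP addresses in the test data). Each later row
   * starts with a key followed by one cell per label. The result maps every
   * key that has at least one cell other than "0" to the labels of those
   * cells, in column order.
   *
   * @param source CSV text; closed before this method returns
   * @return map from row key to the labels whose cell is not "0"
   * @throws IOException if reading from the source fails
   */
  static Map<String, List<String>> parse(Reader source) throws IOException {
    Map<String, List<String>> result = new HashMap<>();
    // try-with-resources closes the reader even if parsing throws.
    try (BufferedReader reader = new BufferedReader(source)) {
      String headerLine = reader.readLine();
      if (headerLine == null) {
        return result; // empty input, nothing to map
      }
      List<String> header = Arrays.asList(headerLine.split(","));
      String line;
      while ((line = reader.readLine()) != null) {
        String[] cells = line.split(",");
        for (int j = 1; j < cells.length; j++) {
          if (!cells[j].equals("0")) {
            result.computeIfAbsent(cells[0], k -> new ArrayList<>()).add(header.get(j));
          }
        }
      }
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    // Made-up sample data, only to show the call shape.
    String csv = "term,10.0.0.1,10.0.0.2\n" + "ocean wind,0,3\n" + "sea ice,1,2\n";
    System.out.println(parse(new StringReader(csv)));
  }
}
```

With the parsing documented and separated like this, the test method itself can shrink to the `preprocess()` call, one `parse(...)` call, and the assertion.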




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238034514
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/weblog/structure/log/GeoIpTest.java
 ##
 @@ -0,0 +1,18 @@
+package org.apache.sdap.mudrod.weblog.structure.log;
+
+import static org.junit.Assert.*;
+
+import org.junit.Assert;
+import org.junit.Test;
+
+public class GeoIpTest {
+
+  @Test
+  public void testToLocation() {
+    GeoIp ip = new GeoIp();
+    String iptest = "185.10.104.194";
+    Coordinates result = ip.toLocation(iptest);
+    Assert.assertEquals("failed in geoip function!", "22.283001,114.150002", result.latlon);
+  }
+
+}
 
 Review comment:
   We cannot ship binary code inside of source code management... the resources 
below should be decompressed.  




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238033924
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/main/AbstractElasticsearchIntegrationTest.java
 ##
 @@ -0,0 +1,37 @@
+package org.apache.sdap.mudrod.main;
+
+import org.apache.sdap.mudrod.driver.EmbeddedElasticsearchServer;
+import org.apache.sdap.mudrod.driver.EmbeddedElasticsearchServer;
+import org.elasticsearch.client.Client;
+import org.junit.After;
+import org.junit.AfterClass;
+import org.junit.Before;
+import org.junit.BeforeClass;
+
+/**
+ * This is a helper class the starts an embedded elasticsearch server
+ * for each test.
+ *
+ * @author Felix Müller
 
 Review comment:
   Who is Felix Muller?




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705442#comment-16705442
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238032665
 
 

 ##
 File path: core/src/main/java/org/apache/sdap/mudrod/main/MudrodEngine.java
 ##
 @@ -391,12 +391,12 @@ public static void main(String[] args) {
       me.end();
     } catch (Exception e) {
       HelpFormatter formatter = new HelpFormatter();
-      formatter.printHelp("MudrodEngine: 'dataDir' argument is mandatory. " + "User must also provide an ingest method.", true);
+      formatter.printHelp("MudrodEngine: 'dataDir' argument is mandatory. " + "User must also provide an ingest method.", options, true);
       LOG.error("Error whilst parsing command line.", e);
     }
   }
 
-  private static void loadPathConfig(MudrodEngine me, String dataDir) {
+  public static void loadPathConfig(MudrodEngine me, String dataDir) {
 
 Review comment:
   Why does this need to be public?




> MUDROD embedded unit test
> -
>
> Key: SDAP-161
> URL: https://issues.apache.org/jira/browse/SDAP-161
> Project: Apache Science Data Analytics Platform
> Issue Type: Improvement
> Components: mudrod
> Affects Versions: 1.0
> Reporter: Yun Li
> Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705462#comment-16705462
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238034182
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/weblog/structure/log/GeoIpTest.java
 ##
 @@ -0,0 +1,18 @@
+package org.apache.sdap.mudrod.weblog.structure.log;
+
+import static org.junit.Assert.*;
 
 Review comment:
   Never use wildcard imports. 




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705459#comment-16705459
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238034053
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/weblog/structure/RequestUrlTest.java
 ##
 @@ -0,0 +1,75 @@
+package org.apache.sdap.mudrod.weblog.structure;
+
+import static org.junit.Assert.*;
+
+import java.io.UnsupportedEncodingException;
+import java.util.Map;
+
+import org.apache.sdap.mudrod.weblog.structure.log.RequestUrl;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class RequestUrlTest {
+   
+// @BeforeClass
 
 Review comment:
   Never leave code commented out like this; it is extremely untidy.




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705454#comment-16705454
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238033385
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/driver/EmbeddedElasticsearchServer.java
 ##
 @@ -0,0 +1,74 @@
+package org.apache.sdap.mudrod.driver;
 
 Review comment:
   License header. 




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705448#comment-16705448
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238033014
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/discoveryengine/WeblogDiscoveryEngineTest.java
 ##
 @@ -0,0 +1,127 @@
+package org.apache.sdap.mudrod.discoveryengine;
+
+import static org.junit.Assert.*;
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.FileNotFoundException;
+import java.io.FileReader;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Properties;
+import org.apache.sdap.mudrod.driver.ESDriver;
+import org.apache.sdap.mudrod.driver.SparkDriver;
+import org.apache.sdap.mudrod.main.AbstractElasticsearchIntegrationTest;
+import org.apache.sdap.mudrod.main.MudrodConstants;
+import org.apache.sdap.mudrod.main.MudrodEngine;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class WeblogDiscoveryEngineTest extends AbstractElasticsearchIntegrationTest {
+
+  private static WeblogDiscoveryEngine weblogEngine = null;
+
+  @BeforeClass
+  public static void setUp() {
+    MudrodEngine mudrodEngine = new MudrodEngine();
+    Properties props = mudrodEngine.loadConfig();
+    ESDriver es = new ESDriver(props);
+    SparkDriver spark = new SparkDriver(props);
+    String dataDir = getTestDataPath();
+    System.out.println(dataDir);
+    props.setProperty(MudrodConstants.DATA_DIR, dataDir);
+    MudrodEngine.loadPathConfig(mudrodEngine, dataDir);
+    weblogEngine = new WeblogDiscoveryEngine(props, es, spark);
+  }
+
+  @AfterClass
+  public static void tearDown() {
+    // TODO
+  }
+
+  private static String getTestDataPath() {
+    File resourcesDirectory = new File("src/test/resources/");
 
 Review comment:
   Use the Java ClassLoader...
   ```
   getClass().getClassLoader().getResource
   ```
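
A minimal sketch of what this suggestion could look like, assuming the test data directory is copied onto the test classpath (as Maven does by default for `src/test/resources`). The class `TestResources` and its method are hypothetical names, not part of the pull request. Because `getTestDataPath()` is static, the sketch uses a class literal rather than `getClass()`.

```java
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for illustration; not part of the pull request. */
final class TestResources {

  private TestResources() {
  }

  /**
   * Resolves a classpath resource to a file system path.
   *
   * @param name resource name relative to the classpath root, without a leading slash
   * @return the location of the resource on disk
   * @throws IllegalArgumentException if the resource is not on the classpath
   */
  static Path resolve(String name) {
    // A static method has no `this`, so use a class literal instead of getClass().
    URL url = TestResources.class.getClassLoader().getResource(name);
    if (url == null) {
      throw new IllegalArgumentException("Resource not found on classpath: " + name);
    }
    try {
      // Converting through a URI decodes escaped characters such as %20.
      return Paths.get(url.toURI());
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException("Unusable resource URL: " + url, e);
    }
  }
}
```

Under these assumptions the test would call something like `TestResources.resolve("Testing_Data_1_3dayLog+Meta+Onto")` instead of building a path from the working directory. The conversion to a `Path` only works while the resources are plain files on disk; it is not expected to work for resources packed inside a jar.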




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705456#comment-16705456
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238033243
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/discoveryengine/WeblogDiscoveryEngineTest.java
 ##
 @@ -0,0 +1,127 @@
+package org.apache.sdap.mudrod.discoveryengine;
+
+import static org.junit.Assert.*;
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.FileNotFoundException;
+import java.io.FileReader;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Properties;
+import org.apache.sdap.mudrod.driver.ESDriver;
+import org.apache.sdap.mudrod.driver.SparkDriver;
+import org.apache.sdap.mudrod.main.AbstractElasticsearchIntegrationTest;
+import org.apache.sdap.mudrod.main.MudrodConstants;
+import org.apache.sdap.mudrod.main.MudrodEngine;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class WeblogDiscoveryEngineTest extends AbstractElasticsearchIntegrationTest {
+
+  private static WeblogDiscoveryEngine weblogEngine = null;
+
+  @BeforeClass
+  public static void setUp() {
+    MudrodEngine mudrodEngine = new MudrodEngine();
+    Properties props = mudrodEngine.loadConfig();
+    ESDriver es = new ESDriver(props);
+    SparkDriver spark = new SparkDriver(props);
+    String dataDir = getTestDataPath();
+    System.out.println(dataDir);
+    props.setProperty(MudrodConstants.DATA_DIR, dataDir);
+    MudrodEngine.loadPathConfig(mudrodEngine, dataDir);
+    weblogEngine = new WeblogDiscoveryEngine(props, es, spark);
+  }
+
+  @AfterClass
+  public static void tearDown() {
+    // TODO
+  }
+
+  private static String getTestDataPath() {
+    File resourcesDirectory = new File("src/test/resources/");
+    String resourcedir = "/Testing_Data_1_3dayLog+Meta+Onto/";
+    String dataDir = resourcesDirectory.getAbsolutePath() + resourcedir;
+    return dataDir;
+  }
+
+  @Test
+  public void testPreprocess() throws IOException {
+
+    weblogEngine.preprocess();
+    testPreprocess_userHistory();
+    testPreprocess_clickStream();
+  }
+
+  private void testPreprocess_userHistory() throws IOException {
 
 Review comment:
   Tests in JUnit are annotated with @Test. See 
https://junit.org/junit4/javadoc/latest/index.html?org/junit/Test.html


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> MUDROD embedded unit test
> -
>
> Key: SDAP-161
> URL: https://issues.apache.org/jira/browse/SDAP-161
> Project: Apache Science Data Analytics Platform
>  Issue Type: Improvement
>  Components: mudrod
>Affects Versions: 1.0
>Reporter: Yun Li
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705452#comment-16705452
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238034104
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/weblog/structure/RequestUrlTest.java
 ##
 @@ -0,0 +1,75 @@
+package org.apache.sdap.mudrod.weblog.structure;
+
+import static org.junit.Assert.*;
+
+import java.io.UnsupportedEncodingException;
+import java.util.Map;
+
+import org.apache.sdap.mudrod.weblog.structure.log.RequestUrl;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class RequestUrlTest {
+   
+// @BeforeClass
+//public void setUp(){
+// RequestUrl url = new RequestUrl();
+// 
+//}
+// @Test
+// public void testRequestUrl() {
+// fail("Not yet implemented");
+// }
+
+   @Test
+   public void testUrlPage() {
+   RequestUrl url = new RequestUrl();
+   String strURL = 
"https://podaac.jpl.nasa.gov/datasetlist?ids=Collections:Measurement:SpatialCoverage:Platform:Sensor&values=SCAT_BYU_L3_OW_SIGMA0_ENHANCED:Sea%20Ice:Bering%20Sea:ERS-2:AMI&view=list";
+   String result = url.urlPage(strURL);
+   //System.out.println(urlPage);
+   ///fail("Not yet implemented");
+   Assert.assertEquals("You did not pass urlPage function ", 
"https://podaac.jpl.nasa.gov/datasetlist", result);
+   }
+
+   @Test
+   public void testuRLRequest() {
+   RequestUrl url = new RequestUrl();
+   String strURL = 
"https://podaac.jpl.nasa.gov/datasetlist?ids=Collections:Measurement:SpatialCoverage:Platform:Sensor&values=SCAT_BYU_L3_OW_SIGMA0_ENHANCED:Sea%20Ice:Bering%20Sea:ERS-2:AMI&view=list";
 
 Review comment:
   Why is this not a constant?
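
   A minimal sketch of what hoisting the URL into a constant could look like. 
`TEST_URL` and `pageOf` are hypothetical names introduced for illustration; 
`pageOf` is only a stand-in for `RequestUrl.urlPage`, whose real implementation 
is not shown in this thread.

```java
class RequestUrlConstantSketch {

  // Shared test input, declared once instead of being repeated in every test method.
  static final String TEST_URL =
      "https://podaac.jpl.nasa.gov/datasetlist?ids=Collections:Measurement:SpatialCoverage:Platform:Sensor"
          + "&values=SCAT_BYU_L3_OW_SIGMA0_ENHANCED:Sea%20Ice:Bering%20Sea:ERS-2:AMI&view=list";

  // Hypothetical stand-in for RequestUrl.urlPage: everything before the query string.
  static String pageOf(String url) {
    int queryStart = url.indexOf('?');
    return queryStart < 0 ? url : url.substring(0, queryStart);
  }

  public static void main(String[] args) {
    System.out.println(pageOf(TEST_URL));
  }
}
```

   Each test method can then refer to the one constant, so a change to the 
sample URL only has to be made in one place.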




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705439#comment-16705439
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238032419
 
 

 ##
 File path: 
core/src/main/java/org/apache/sdap/mudrod/discoveryengine/WeblogDiscoveryEngine.java
 ##
 @@ -105,7 +105,7 @@ public void preprocess() {
   startTime = System.currentTimeMillis();
   LOG.info("Processing logs dated {}", anInputList);
 
-  DiscoveryStepAbstract im = new ImportLogFile(this.props, this.es, 
this.spark);
+ /* DiscoveryStepAbstract im = new ImportLogFile(this.props, this.es, 
this.spark);
 
 Review comment:
   OK, the issue SDAP-161 is described as being **MUDROD embedded unit test**, 
so there should be no code other than an embedded unit test...




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705445#comment-16705445
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238032819
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/discoveryengine/WeblogDiscoveryEngineTest.java
 ##
 @@ -0,0 +1,127 @@
+package org.apache.sdap.mudrod.discoveryengine;
 
 Review comment:
   Provide license header.




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705458#comment-16705458
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238034551
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/weblog/structure/log/GeoIpTest.java
 ##
 @@ -0,0 +1,18 @@
+package org.apache.sdap.mudrod.weblog.structure.log;
+
+import static org.junit.Assert.*;
+
+import org.junit.Assert;
+import org.junit.Test;
+
+public class GeoIpTest {
+
+   @Test
+   public void testToLocation() {
+   GeoIp ip = new GeoIp();
+   String iptest = "185.10.104.194";
+   Coordinates result = ip.toLocation(iptest);
+   Assert.assertEquals("failed in geoip function!", 
"22.283001,114.150002", result.latlon);
+   }
+
+}
 
 Review comment:
   Also, is there any reason we are using logs from 2015? Why not 2018?




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705441#comment-16705441
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238032482
 
 

 ##
 File path: 
core/src/main/java/org/apache/sdap/mudrod/discoveryengine/WeblogDiscoveryEngine.java
 ##
 @@ -118,15 +118,15 @@ public void preprocess() {
   ss.execute();
 
   DiscoveryStepAbstract rr = new RemoveRawLog(this.props, this.es, 
this.spark);
-  rr.execute();
+  rr.execute();*/
 
   endTime = System.currentTimeMillis();
 
   LOG.info("Web log preprocessing for logs dated {} complete. Time elapsed 
{} seconds.", anInputList, (endTime - startTime) / 1000);
 }
 
-DiscoveryStepAbstract hg = new HistoryGenerator(this.props, this.es, 
this.spark);
-hg.execute();
+/*DiscoveryStepAbstract hg = new HistoryGenerator(this.props, this.es, 
this.spark);
 
 Review comment:
   Same here...




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705461#comment-16705461
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238034160
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/weblog/structure/log/GeoIpTest.java
 ##
 @@ -0,0 +1,18 @@
+package org.apache.sdap.mudrod.weblog.structure.log;
 
 Review comment:
   Missing license header. 




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705457#comment-16705457
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238034020
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/weblog/structure/RequestUrlTest.java
 ##
 @@ -0,0 +1,75 @@
+package org.apache.sdap.mudrod.weblog.structure;
+
+import static org.junit.Assert.*;
 
 Review comment:
   Never use wildcard imports. 




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705446#comment-16705446
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238032872
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/discoveryengine/WeblogDiscoveryEngineTest.java
 ##
 @@ -0,0 +1,127 @@
+package org.apache.sdap.mudrod.discoveryengine;
+
+import static org.junit.Assert.*;
 
 Review comment:
   Never use wildcard imports. 




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705444#comment-16705444
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238033873
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/main/AbstractElasticsearchIntegrationTest.java
 ##
 @@ -0,0 +1,37 @@
+package org.apache.sdap.mudrod.main;
 
 Review comment:
   License header. 




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705451#comment-16705451
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238034514
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/weblog/structure/log/GeoIpTest.java
 ##
 @@ -0,0 +1,18 @@
+package org.apache.sdap.mudrod.weblog.structure.log;
+
+import static org.junit.Assert.*;
+
+import org.junit.Assert;
+import org.junit.Test;
+
+public class GeoIpTest {
+
+   @Test
+   public void testToLocation() {
+   GeoIp ip = new GeoIp();
+   String iptest = "185.10.104.194";
+   Coordinates result = ip.toLocation(iptest);
+   Assert.assertEquals("failed in geoip function!", 
"22.283001,114.150002", result.latlon);
+   }
+
+}
 
 Review comment:
   We cannot ship binary code inside of source code management... the resources 
below should be decompressed.  




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705438#comment-16705438
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238032781
 
 

 ##
 File path: 
core/src/main/java/org/apache/sdap/mudrod/weblog/structure/session/SessionTree.java
 ##
 @@ -205,7 +208,8 @@ public JsonObject treeToJson(SessionNode node) {
   RequestUrl requestURL = new RequestUrl();
   String viewquery = "";
   try {
-String infoStr = requestURL.getSearchInfo(viewnode.getRequest());
+//String infoStr = requestURL.getSearchInfo(viewnode.getRequest());
 
 Review comment:
   Do not comment out code...




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705440#comment-16705440
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238032627
 
 

 ##
 File path: core/src/main/java/org/apache/sdap/mudrod/main/MudrodEngine.java
 ##
 @@ -391,12 +391,12 @@ public static void main(String[] args) {
   me.end();
 } catch (Exception e) {
   HelpFormatter formatter = new HelpFormatter();
-  formatter.printHelp("MudrodEngine: 'dataDir' argument is mandatory. " + 
"User must also provide an ingest method.", true);
+  formatter.printHelp("MudrodEngine: 'dataDir' argument is mandatory. " + 
"User must also provide an ingest method.", options, true);
 
 Review comment:
   This code is buggy. We just want to throw the Exception... not continuously 
print the help arguments. 
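
   As a rough illustration of the alternative, the sketch below wraps and 
rethrows the caught exception rather than printing usage help for every 
failure. `RethrowSketch`, `ingest` and `run` are hypothetical names and do not 
reflect the actual structure of `MudrodEngine.main`; the exception types are 
likewise an assumption.

```java
class RethrowSketch {

  // Hypothetical stand-in for the work done inside the try block.
  static void ingest(String dataDir) {
    if (dataDir == null) {
      throw new IllegalArgumentException("'dataDir' argument is mandatory");
    }
  }

  // Wrap and rethrow so the original cause stays visible to the caller,
  // instead of replacing every failure with usage help.
  static void run(String dataDir) {
    try {
      ingest(dataDir);
    } catch (Exception e) {
      throw new IllegalStateException("MudrodEngine failed", e);
    }
  }

  public static void main(String[] args) {
    run("/tmp/data"); // completes without output
    try {
      run(null);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage() + ": " + e.getCause().getMessage());
    }
  }
}
```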




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705449#comment-16705449
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238032746
 
 

 ##
 File path: 
core/src/main/java/org/apache/sdap/mudrod/weblog/pre/SessionStatistic.java
 ##
 @@ -229,7 +229,7 @@ public int processSession(ESDriver es, String sessionId) 
throws IOException, Int
   String[] keywordList = keywords.split(",");
   for (String item : items) {
 if (!Arrays.asList(keywordList).contains(item)) {
-  keywords = keywords + item + ",";
+  keywords = keywords + "," + item + ",";
 
 Review comment:
   Can you explain this addition? What was wrong with it previously? Do you 
have a unit test for it?
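
   To make the question concrete, the sketch below compares the two 
concatenations on two possible starting values of `keywords`. The starting 
values are assumptions for illustration, since the diff does not show how 
`keywords` is initialised; `appendOld` and `appendNew` are hypothetical helpers 
that mirror the removed and the added line.

```java
import java.util.Arrays;

class KeywordAppendSketch {

  // Mirrors the removed line: keywords = keywords + item + ",";
  static String appendOld(String keywords, String item) {
    return keywords + item + ",";
  }

  // Mirrors the added line: keywords = keywords + "," + item + ",";
  static String appendNew(String keywords, String item) {
    return keywords + "," + item + ",";
  }

  public static void main(String[] args) {
    // Assumed starting values: one with a trailing comma, one without.
    for (String start : new String[] {"ocean,wind,", "ocean,wind"}) {
      String oldResult = appendOld(start, "sst");
      String newResult = appendNew(start, "sst");
      System.out.println(start + " -> old: " + Arrays.toString(oldResult.split(","))
          + ", new: " + Arrays.toString(newResult.split(",")));
    }
  }
}
```

   If the starting value ends with a comma, the old form keeps the tokens 
separate and the new form introduces an empty token. If it does not, the old 
form fuses `wind` and `sst` into one token and the new form keeps them apart. 
A unit test that pins down which starting value actually occurs would settle 
which form is correct.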




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705443#comment-16705443
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238032465
 
 

 ##
 File path: 
core/src/main/java/org/apache/sdap/mudrod/discoveryengine/WeblogDiscoveryEngine.java
 ##
 @@ -105,7 +105,7 @@ public void preprocess() {
   startTime = System.currentTimeMillis();
   LOG.info("Processing logs dated {}", anInputList);
 
-  DiscoveryStepAbstract im = new ImportLogFile(this.props, this.es, 
this.spark);
+ /* DiscoveryStepAbstract im = new ImportLogFile(this.props, this.es, 
this.spark);
 
 Review comment:
   Additionally, if you are going to comment out code that you do not intend to 
use... just remove it. 




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705447#comment-16705447
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238033999
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/weblog/structure/RequestUrlTest.java
 ##
 @@ -0,0 +1,75 @@
+package org.apache.sdap.mudrod.weblog.structure;
 
 Review comment:
   License header. 




[jira] [Commented] (SDAP-161) MUDROD embedded unit test

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705453#comment-16705453
 ] 

ASF GitHub Bot commented on SDAP-161:
-

lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238033924
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/main/AbstractElasticsearchIntegrationTest.java
 ##
 @@ -0,0 +1,37 @@
+package org.apache.sdap.mudrod.main;
+
+import org.apache.sdap.mudrod.driver.EmbeddedElasticsearchServer;
+import org.apache.sdap.mudrod.driver.EmbeddedElasticsearchServer;
+import org.elasticsearch.client.Client;
+import org.junit.After;
+import org.junit.AfterClass;
+import org.junit.Before;
+import org.junit.BeforeClass;
+
+/**
+ * This is a helper class the starts an embedded elasticsearch server
+ * for each test.
+ *
+ * @author Felix Müller
 
 Review comment:
   Who is Felix Muller?




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238032872
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/discoveryengine/WeblogDiscoveryEngineTest.java
 ##
 @@ -0,0 +1,127 @@
+package org.apache.sdap.mudrod.discoveryengine;
+
+import static org.junit.Assert.*;
 
 Review comment:
   Never use wildcard imports. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238032419
 
 

 ##
 File path: 
core/src/main/java/org/apache/sdap/mudrod/discoveryengine/WeblogDiscoveryEngine.java
 ##
 @@ -105,7 +105,7 @@ public void preprocess() {
   startTime = System.currentTimeMillis();
   LOG.info("Processing logs dated {}", anInputList);
 
-  DiscoveryStepAbstract im = new ImportLogFile(this.props, this.es, 
this.spark);
+ /* DiscoveryStepAbstract im = new ImportLogFile(this.props, this.es, 
this.spark);
 
 Review comment:
   OK, the issue SDAP-161 is described as being **MUDROD embedded unit test**, 
so there should be no code other than an embedded unit test...




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238034160
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/weblog/structure/log/GeoIpTest.java
 ##
 @@ -0,0 +1,18 @@
+package org.apache.sdap.mudrod.weblog.structure.log;
 
 Review comment:
   Missing license header. 




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238032465
 
 

 ##
 File path: 
core/src/main/java/org/apache/sdap/mudrod/discoveryengine/WeblogDiscoveryEngine.java
 ##
 @@ -105,7 +105,7 @@ public void preprocess() {
   startTime = System.currentTimeMillis();
   LOG.info("Processing logs dated {}", anInputList);
 
-  DiscoveryStepAbstract im = new ImportLogFile(this.props, this.es, 
this.spark);
+ /* DiscoveryStepAbstract im = new ImportLogFile(this.props, this.es, 
this.spark);
 
 Review comment:
   Additionally, if you are going to comment out code that you do not intend to 
use... just remove it. 




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238033243
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/discoveryengine/WeblogDiscoveryEngineTest.java
 ##
 @@ -0,0 +1,127 @@
+package org.apache.sdap.mudrod.discoveryengine;
+
+import static org.junit.Assert.*;
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.FileNotFoundException;
+import java.io.FileReader;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Properties;
+import org.apache.sdap.mudrod.driver.ESDriver;
+import org.apache.sdap.mudrod.driver.SparkDriver;
+import org.apache.sdap.mudrod.main.AbstractElasticsearchIntegrationTest;
+import org.apache.sdap.mudrod.main.MudrodConstants;
+import org.apache.sdap.mudrod.main.MudrodEngine;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class WeblogDiscoveryEngineTest extends 
AbstractElasticsearchIntegrationTest {
+
+  private static WeblogDiscoveryEngine weblogEngine = null;
+
+  @BeforeClass
+  public static void setUp() {
+MudrodEngine mudrodEngine = new MudrodEngine();
+Properties props = mudrodEngine.loadConfig();
+ESDriver es = new ESDriver(props);
+SparkDriver spark = new SparkDriver(props);
+String dataDir = getTestDataPath();
+System.out.println(dataDir);
+props.setProperty(MudrodConstants.DATA_DIR, dataDir);
+MudrodEngine.loadPathConfig(mudrodEngine, dataDir);
+weblogEngine = new WeblogDiscoveryEngine(props, es, spark);
+  }
+
+  @AfterClass
+  public static void tearDown() {
+// TODO
+  }
+
+  private static String getTestDataPath() {
+File resourcesDirectory = new File("src/test/resources/");
+String resourcedir = "/Testing_Data_1_3dayLog+Meta+Onto/";
+String dataDir = resourcesDirectory.getAbsolutePath() + resourcedir;
+return dataDir;
+  }
+
+  @Test
+  public void testPreprocess() throws IOException {
+
+weblogEngine.preprocess();
+testPreprocess_userHistory();
+testPreprocess_clickStream();
+  }
+
+  private void testPreprocess_userHistory() throws IOException {
 
 Review comment:
   Tests in JUnit are annotated with @Test. See 
https://junit.org/junit4/javadoc/latest/index.html?org/junit/Test.html




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238032781
 
 

 ##
 File path: 
core/src/main/java/org/apache/sdap/mudrod/weblog/structure/session/SessionTree.java
 ##
 @@ -205,7 +208,8 @@ public JsonObject treeToJson(SessionNode node) {
   RequestUrl requestURL = new RequestUrl();
   String viewquery = "";
   try {
-String infoStr = requestURL.getSearchInfo(viewnode.getRequest());
+//String infoStr = requestURL.getSearchInfo(viewnode.getRequest());
 
 Review comment:
   Do not comment out code...




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238034551
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/weblog/structure/log/GeoIpTest.java
 ##
 @@ -0,0 +1,18 @@
+package org.apache.sdap.mudrod.weblog.structure.log;
+
+import static org.junit.Assert.*;
+
+import org.junit.Assert;
+import org.junit.Test;
+
+public class GeoIpTest {
+
+   @Test
+   public void testToLocation() {
+   GeoIp ip = new GeoIp();
+   String iptest = "185.10.104.194";
+   Coordinates result = ip.toLocation(iptest);
+   Assert.assertEquals("failed in geoip function!", 
"22.283001,114.150002", result.latlon);
+   }
+
+}
 
 Review comment:
   Also, is there any reason we are using logs from 2015? Why not 2018?




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238032665
 
 

 ##
 File path: core/src/main/java/org/apache/sdap/mudrod/main/MudrodEngine.java
 ##
 @@ -391,12 +391,12 @@ public static void main(String[] args) {
   me.end();
 } catch (Exception e) {
   HelpFormatter formatter = new HelpFormatter();
-  formatter.printHelp("MudrodEngine: 'dataDir' argument is mandatory. " + 
"User must also provide an ingest method.", true);
+  formatter.printHelp("MudrodEngine: 'dataDir' argument is mandatory. " + 
"User must also provide an ingest method.", options, true);
   LOG.error("Error whilst parsing command line.", e);
 }
   }
 
-  private static void loadPathConfig(MudrodEngine me, String dataDir) {
+  public static void loadPathConfig(MudrodEngine me, String dataDir) {
 
 Review comment:
   Why does this need to be public?




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238034020
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/weblog/structure/RequestUrlTest.java
 ##
 @@ -0,0 +1,75 @@
+package org.apache.sdap.mudrod.weblog.structure;
+
+import static org.junit.Assert.*;
 
 Review comment:
   Never use wildcard imports. 
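
One way to act on this comment is to import only the members that are actually used. The sketch below uses `java.lang.Math` as a stand-in so that it compiles without JUnit on the classpath; the class name `ExplicitImportsSketch` and the `clamp` helper are illustrative and not part of MUDROD. In the test itself the equivalent would presumably be a line such as `import static org.junit.Assert.assertEquals;` in place of `import static org.junit.Assert.*;`.

```java
// Explicit static imports: every unqualified name below maps to exactly one import line.
import static java.lang.Math.max;
import static java.lang.Math.min;

final class ExplicitImportsSketch {

  // Restricts value to the inclusive range [low, high].
  static int clamp(int value, int low, int high) {
    return max(low, min(value, high));
  }

  public static void main(String[] args) {
    System.out.println(clamp(15, 0, 10));
  }
}
```

With explicit imports, a reader can tell where `max` and `min` come from without an IDE, and a later library upgrade cannot silently introduce a clashing name.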




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238034053
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/weblog/structure/RequestUrlTest.java
 ##
 @@ -0,0 +1,75 @@
+package org.apache.sdap.mudrod.weblog.structure;
+
+import static org.junit.Assert.*;
+
+import java.io.UnsupportedEncodingException;
+import java.util.Map;
+
+import org.apache.sdap.mudrod.weblog.structure.log.RequestUrl;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class RequestUrlTest {
+   
+// @BeforeClass
 
 Review comment:
   Never leave code commented out like this; it is extremely untidy. 




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238032746
 
 

 ##
 File path: 
core/src/main/java/org/apache/sdap/mudrod/weblog/pre/SessionStatistic.java
 ##
 @@ -229,7 +229,7 @@ public int processSession(ESDriver es, String sessionId) 
throws IOException, Int
   String[] keywordList = keywords.split(",");
   for (String item : items) {
 if (!Arrays.asList(keywordList).contains(item)) {
-  keywords = keywords + item + ",";
+  keywords = keywords + "," + item + ",";
 
 Review comment:
   Can you explain this addition? What was wrong with it previously? Do you 
have a unit test for it?
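
To make the question concrete, the sketch below reproduces both versions of the concatenation in isolation. It assumes nothing about how `keywords` is initialised in `SessionStatistic`, which the diff does not show; `KeywordMergeSketch`, `appendOld`, `appendNew`, and `merge` are hypothetical names. If `keywords` already ends with a comma, the new line produces an empty entry (`sst,,wind,`); if it does not, the old line fuses two entries (`sstwind,`). The `merge` helper is one possible alternative, not the project's method: it de-duplicates and joins without leading, trailing, or doubled commas.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

final class KeywordMergeSketch {

  // Behaviour of the removed line: keywords + item + ","
  static String appendOld(String keywords, String item) {
    return keywords + item + ",";
  }

  // Behaviour of the added line: keywords + "," + item + ","
  static String appendNew(String keywords, String item) {
    return keywords + "," + item + ",";
  }

  // Hypothetical alternative: keep insertion order, drop duplicates and empty entries.
  static String merge(String keywords, String[] items) {
    Set<String> merged = new LinkedHashSet<>();
    for (String keyword : keywords.split(",")) {
      if (!keyword.isEmpty()) {
        merged.add(keyword);
      }
    }
    merged.addAll(Arrays.asList(items));
    return String.join(",", merged);
  }

  public static void main(String[] args) {
    System.out.println(appendOld("sst,", "wind"));
    System.out.println(appendNew("sst,", "wind"));
    System.out.println(merge("sst,", new String[] {"wind", "sst", "wind"}));
  }
}
```

A unit test along these lines, written against the real `processSession` input, would answer which of the two cases actually occurs.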




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238034104
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/weblog/structure/RequestUrlTest.java
 ##
 @@ -0,0 +1,75 @@
+package org.apache.sdap.mudrod.weblog.structure;
+
+import static org.junit.Assert.*;
+
+import java.io.UnsupportedEncodingException;
+import java.util.Map;
+
+import org.apache.sdap.mudrod.weblog.structure.log.RequestUrl;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class RequestUrlTest {
+   
+// @BeforeClass
+//public void setUp(){
+// RequestUrl url = new RequestUrl();
+// 
+//}
+// @Test
+// public void testRequestUrl() {
+// fail("Not yet implemented");
+// }
+
+   @Test
+   public void testUrlPage() {
+   RequestUrl url = new RequestUrl();
+   String strURL = "https://podaac.jpl.nasa.gov/datasetlist?ids=Collections:Measurement:SpatialCoverage:Platform:Sensor&values=SCAT_BYU_L3_OW_SIGMA0_ENHANCED:Sea%20Ice:Bering%20Sea:ERS-2:AMI&view=list";
+   String result = url.urlPage(strURL);
+   //System.out.println(urlPage);
+   ///fail("Not yet implemented");
+   Assert.assertEquals("You did not pass urlPage function ", "https://podaac.jpl.nasa.gov/datasetlist", result);
+   }
+
+   @Test
+   public void testuRLRequest() {
+   RequestUrl url = new RequestUrl();
+   String strURL = "https://podaac.jpl.nasa.gov/datasetlist?ids=Collections:Measurement:SpatialCoverage:Platform:Sensor&values=SCAT_BYU_L3_OW_SIGMA0_ENHANCED:Sea%20Ice:Bering%20Sea:ERS-2:AMI&view=list";
 
 Review comment:
   Why is this not a constant?
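
A minimal sketch of what this could look like, assuming the same URL is meant to be shared by every test method in the class. `RequestUrlFixtureSketch`, `DATASET_LIST_URL`, and `pageOf` are hypothetical names; `pageOf` is only a stand-in so the example runs without MUDROD on the classpath and is not the implementation of `RequestUrl.urlPage`.

```java
final class RequestUrlFixtureSketch {

  // Shared test fixture, declared once instead of being repeated in every test method.
  static final String DATASET_LIST_URL =
      "https://podaac.jpl.nasa.gov/datasetlist"
          + "?ids=Collections:Measurement:SpatialCoverage:Platform:Sensor"
          + "&values=SCAT_BYU_L3_OW_SIGMA0_ENHANCED:Sea%20Ice:Bering%20Sea:ERS-2:AMI"
          + "&view=list";

  // Hypothetical stand-in for RequestUrl.urlPage: returns the URL without its query string.
  static String pageOf(String url) {
    int queryStart = url.indexOf('?');
    return queryStart < 0 ? url : url.substring(0, queryStart);
  }

  public static void main(String[] args) {
    System.out.println(pageOf(DATASET_LIST_URL));
  }
}
```

Declaring the fixture once means a change to the sample URL touches one line rather than every test that uses it.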




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238033014
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/discoveryengine/WeblogDiscoveryEngineTest.java
 ##
 @@ -0,0 +1,127 @@
+package org.apache.sdap.mudrod.discoveryengine;
+
+import static org.junit.Assert.*;
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.FileNotFoundException;
+import java.io.FileReader;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Properties;
+import org.apache.sdap.mudrod.driver.ESDriver;
+import org.apache.sdap.mudrod.driver.SparkDriver;
+import org.apache.sdap.mudrod.main.AbstractElasticsearchIntegrationTest;
+import org.apache.sdap.mudrod.main.MudrodConstants;
+import org.apache.sdap.mudrod.main.MudrodEngine;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class WeblogDiscoveryEngineTest extends 
AbstractElasticsearchIntegrationTest {
+
+  private static WeblogDiscoveryEngine weblogEngine = null;
+
+  @BeforeClass
+  public static void setUp() {
+MudrodEngine mudrodEngine = new MudrodEngine();
+Properties props = mudrodEngine.loadConfig();
+ESDriver es = new ESDriver(props);
+SparkDriver spark = new SparkDriver(props);
+String dataDir = getTestDataPath();
+System.out.println(dataDir);
+props.setProperty(MudrodConstants.DATA_DIR, dataDir);
+MudrodEngine.loadPathConfig(mudrodEngine, dataDir);
+weblogEngine = new WeblogDiscoveryEngine(props, es, spark);
+  }
+
+  @AfterClass
+  public static void tearDown() {
+// TODO
+  }
+
+  private static String getTestDataPath() {
+File resourcesDirectory = new File("src/test/resources/");
 
 Review comment:
   Use the Java ClassLoader...
   ```
   getClass().getClassLoader().getResource
   ```
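
A hedged sketch of the suggestion, assuming the test data is on the test classpath as a plain directory (as Maven does for `src/test/resources`) and not packed inside a jar. `TestResourceSketch` and `resourcePath` are hypothetical names. Because `getTestDataPath()` in the pull request is static, `getClass()` is not available there, so the sketch goes through the class literal instead.

```java
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Paths;

final class TestResourceSketch {

  // Resolves a classpath resource to an absolute file-system path.
  // A static method has no `this`, so the class literal replaces getClass().
  static String resourcePath(String name) {
    URL url = TestResourceSketch.class.getClassLoader().getResource(name);
    if (url == null) {
      throw new IllegalStateException("Resource not found on classpath: " + name);
    }
    try {
      return Paths.get(url.toURI()).toString();
    } catch (URISyntaxException e) {
      throw new IllegalStateException("Unusable resource URL: " + url, e);
    }
  }

  public static void main(String[] args) {
    // Directory name taken from getTestDataPath() in the pull request.
    try {
      System.out.println(resourcePath("Testing_Data_1_3dayLog+Meta+Onto"));
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Unlike `new File("src/test/resources/")`, this lookup does not depend on the working directory the tests are launched from.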




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238033385
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/driver/EmbeddedElasticsearchServer.java
 ##
 @@ -0,0 +1,74 @@
+package org.apache.sdap.mudrod.driver;
 
 Review comment:
   License header. 




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238032482
 
 

 ##
 File path: 
core/src/main/java/org/apache/sdap/mudrod/discoveryengine/WeblogDiscoveryEngine.java
 ##
 @@ -118,15 +118,15 @@ public void preprocess() {
   ss.execute();
 
   DiscoveryStepAbstract rr = new RemoveRawLog(this.props, this.es, 
this.spark);
-  rr.execute();
+  rr.execute();*/
 
   endTime = System.currentTimeMillis();
 
   LOG.info("Web log preprocessing for logs dated {} complete. Time elapsed 
{} seconds.", anInputList, (endTime - startTime) / 1000);
 }
 
-DiscoveryStepAbstract hg = new HistoryGenerator(this.props, this.es, 
this.spark);
-hg.execute();
+/*DiscoveryStepAbstract hg = new HistoryGenerator(this.props, this.es, 
this.spark);
 
 Review comment:
   Same here...




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238034182
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/weblog/structure/log/GeoIpTest.java
 ##
 @@ -0,0 +1,18 @@
+package org.apache.sdap.mudrod.weblog.structure.log;
+
+import static org.junit.Assert.*;
 
 Review comment:
   Never use wildcard imports. 




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238033873
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/main/AbstractElasticsearchIntegrationTest.java
 ##
 @@ -0,0 +1,37 @@
+package org.apache.sdap.mudrod.main;
 
 Review comment:
   License header. 




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238032627
 
 

 ##
 File path: core/src/main/java/org/apache/sdap/mudrod/main/MudrodEngine.java
 ##
 @@ -391,12 +391,12 @@ public static void main(String[] args) {
   me.end();
 } catch (Exception e) {
   HelpFormatter formatter = new HelpFormatter();
-  formatter.printHelp("MudrodEngine: 'dataDir' argument is mandatory. " + 
"User must also provide an ingest method.", true);
+  formatter.printHelp("MudrodEngine: 'dataDir' argument is mandatory. " + 
"User must also provide an ingest method.", options, true);
 
 Review comment:
   This code is buggy. We just want to throw the Exception... not continuously 
print the help arguments. 
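
A minimal sketch of the behaviour this comment asks for, assuming the goal is to fail fast when the arguments are invalid. It deliberately avoids Commons CLI so that it is self-contained; `FailFastArgsSketch`, `requireDataDir`, and the `-dataDir` flag spelling are hypothetical, and only the error message text is taken from the diff.

```java
final class FailFastArgsSketch {

  // Returns the value following "-dataDir", or throws if the flag or its value is missing.
  static String requireDataDir(String[] args) {
    for (int i = 0; i + 1 < args.length; i++) {
      if ("-dataDir".equals(args[i])) {
        return args[i + 1];
      }
    }
    throw new IllegalArgumentException("MudrodEngine: 'dataDir' argument is mandatory.");
  }

  public static void main(String[] args) {
    // Sample arguments; with the flag absent the exception would propagate to the caller
    // instead of being swallowed by a catch block that prints help text.
    String[] sample = {"-dataDir", "/tmp/mudrod-data"};
    System.out.println("dataDir = " + requireDataDir(sample));
  }
}
```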




[GitHub] lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded unit test

2018-11-30 Thread GitBox
lewismc commented on a change in pull request #35: SDAP-161 MUDROD embedded 
unit test
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#discussion_r238032819
 
 

 ##
 File path: 
core/src/test/java/org/apache/sdap/mudrod/discoveryengine/WeblogDiscoveryEngineTest.java
 ##
 @@ -0,0 +1,127 @@
+package org.apache.sdap.mudrod.discoveryengine;
 
 Review comment:
   Provide license header.




[GitHub] quintinali commented on issue #35: Sdap 161

2018-11-30 Thread GitBox
quintinali commented on issue #35: Sdap 161
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#issuecomment-443366045
 
 
   @fgreg @lewismc Could you please take a look at the code? As I mentioned in 
the last meeting, the EmbeddedElasticsearchServer does not work in the code and 
I cannot fix it. Could anyone help me figure out what the problem is? If 
you compile the code, please use the command "mvn clean install -DskipTests"; 
otherwise the unit tests will fail. I run these unit test cases with Eclipse and an 
Elasticsearch engine installed on my computer.




[GitHub] quintinali opened a new pull request #35: Sdap 161

2018-11-30 Thread GitBox
quintinali opened a new pull request #35: Sdap 161
URL: https://github.com/apache/incubator-sdap-mudrod/pull/35
 
 
   test case for log ingestion and preprocessing




[GitHub] asfgit commented on issue #35: Sdap 161

2018-11-30 Thread GitBox
asfgit commented on issue #35: Sdap 161
URL: 
https://github.com/apache/incubator-sdap-mudrod/pull/35#issuecomment-443365249
 
 
   Can one of the admins verify this patch?




[jira] [Commented] (SDAP-151) Determine parallelism automatically for Spark analytics

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705207#comment-16705207
 ] 

ASF GitHub Bot commented on SDAP-151:
-

fgreg closed pull request #60: SDAP-151 Determine parallelism automatically for 
Spark analytics (#50)
URL: https://github.com/apache/incubator-sdap-nexus/pull/60
 
 
   




> Determine parallelism automatically for Spark analytics
> ---
>
> Key: SDAP-151
> URL: https://issues.apache.org/jira/browse/SDAP-151
> Project: Apache Science Data Analytics Platform
>  Issue Type: Improvement
>Reporter: Joseph Jacob
>Assignee: Joseph Jacob
>Priority: Major
>
> Some of the built-in NEXUS analytics like TimeSeries and TimeAvgMap currently 
> get the desired parallelism from a job request parameter like 
> "spark=mesos,16,32".  If that is omitted, we currently default to 
> "spark=local,1,1", which runs on a single core.  Instead we would like to 
> automatically determine the appropriate level of parallelism based on the 
> job's input data size.
>  
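
The behaviour described in the issue can be sketched roughly as follows. This is an illustration under stated assumptions, not the NEXUS implementation (which is not shown in this thread): the issue does not say what the two integers in `spark=mesos,16,32` mean, so the field names `executors` and `partitions` are guesses, and `autoPartitions` with its `bytesPerPartition` and `maxPartitions` parameters is a hypothetical heuristic. Only the example value and the `local,1,1` default come from the issue text.

```java
final class SparkParallelismSketch {

  final String master;
  final int executors;  // assumed meaning of the first integer
  final int partitions; // assumed meaning of the second integer

  SparkParallelismSketch(String master, int executors, int partitions) {
    this.master = master;
    this.executors = executors;
    this.partitions = partitions;
  }

  // Parses a value such as "mesos,16,32"; falls back to "local,1,1" when it is omitted.
  static SparkParallelismSketch parse(String value) {
    String effective = (value == null || value.isEmpty()) ? "local,1,1" : value;
    String[] parts = effective.split(",");
    if (parts.length != 3) {
      throw new IllegalArgumentException("Expected master,int,int but got: " + effective);
    }
    return new SparkParallelismSketch(
        parts[0].trim(), Integer.parseInt(parts[1].trim()), Integer.parseInt(parts[2].trim()));
  }

  // Hypothetical heuristic: one partition per bytesPerPartition of input,
  // never fewer than 1 and never more than maxPartitions.
  static int autoPartitions(long inputBytes, long bytesPerPartition, int maxPartitions) {
    if (bytesPerPartition <= 0 || maxPartitions < 1) {
      throw new IllegalArgumentException("bytesPerPartition and maxPartitions must be positive");
    }
    long needed = (inputBytes + bytesPerPartition - 1) / bytesPerPartition;
    return (int) Math.max(1L, Math.min((long) maxPartitions, needed));
  }

  public static void main(String[] args) {
    SparkParallelismSketch explicit = parse("mesos,16,32");
    System.out.println(explicit.master + " " + explicit.executors + " " + explicit.partitions);
    System.out.println(autoPartitions(95L, 10L, 64));
  }
}
```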



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SDAP-151) Determine parallelism automatically for Spark analytics

2018-11-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/SDAP-151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705206#comment-16705206
 ] 

ASF GitHub Bot commented on SDAP-151:
-

fgreg opened a new pull request #60: SDAP-151 Determine parallelism 
automatically for Spark analytics (#50)
URL: https://github.com/apache/incubator-sdap-nexus/pull/60
 
 
   * Removed spark configuration, added nparts configuration, and autocompute 
parallelism for spark-based time series.
   
   * SDAP-151 Determine parallelism automatically for Spark analytics




> Determine parallelism automatically for Spark analytics
> ---
>
> Key: SDAP-151
> URL: https://issues.apache.org/jira/browse/SDAP-151
> Project: Apache Science Data Analytics Platform
>  Issue Type: Improvement
>Reporter: Joseph Jacob
>Assignee: Joseph Jacob
>Priority: Major
>
> Some of the built-in NEXUS analytics like TimeSeries and TimeAvgMap currently 
> get the desired parallelism from a job request parameter like 
> "spark=mesos,16,32".  If that is omitted, we currently default to 
> "spark=local,1,1", which runs on a single core.  Instead we would like to 
> automatically determine the appropriate level of parallelism based on the 
> job's input data size.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] fgreg closed pull request #60: SDAP-151 Determine parallelism automatically for Spark analytics (#50)

2018-11-30 Thread GitBox
fgreg closed pull request #60: SDAP-151 Determine parallelism automatically for 
Spark analytics (#50)
URL: https://github.com/apache/incubator-sdap-nexus/pull/60
 
 
   




[GitHub] fgreg opened a new pull request #60: SDAP-151 Determine parallelism automatically for Spark analytics (#50)

2018-11-30 Thread GitBox
fgreg opened a new pull request #60: SDAP-151 Determine parallelism 
automatically for Spark analytics (#50)
URL: https://github.com/apache/incubator-sdap-nexus/pull/60
 
 
   * Removed spark configuration, added nparts configuration, and autocompute 
parallelism for spark-based time series.
   
   * SDAP-151 Determine parallelism automatically for Spark analytics

