Re: Hive UDF accessing https request
Sergey, yes, there may well be some difference between the environment of the Hive task and that of the separate program, which would explain why the Hive task cannot complete the https request.

Gopal, we have ca-certificates installed, but there is no java subdirectory under /etc/ssl/certs; the directory tree stops at /etc/ssl/certs:

  ls -ltr /etc/ssl/certs/java/
  ls: cannot access /etc/ssl/certs/java/: No such file or directory

  rpm -qa | grep cert
  ca-certificates-2013.1.95-65.1.el6_5.noarch
  ca-certificates-2014.1.98-65.1.el6.noarch

Can you help me check whether a sample https request succeeds from a Hive UDF, to confirm whether this is a configuration issue on my end or a bug in Hive?

Thanks,
Prabhu Joseph

On Mon, Jan 11, 2016 at 11:53 PM, Sergey Shelukhin wrote:
> Hmm, I’ve no idea off the top of my head what this exception means.
> My guess is that something is different about the environment in which the
> Hive task runs vs. the separate program. Different machine, different user,
> different Java args, path, not sure. It probably cannot find some Java SSL
> thing, e.g. a truststore, from the Hive task, or doesn’t have access to it.
>
> From: Prabhu Joseph
> Reply-To: "u...@hive.apache.org"
> Date: Sunday, January 10, 2016 at 22:06
> To: "u...@hive.apache.org"
> Cc: "dev@hive.apache.org"
> Subject: Re: Hive UDF accessing https request
>
> Thanks Sergey for looking into this.
> Below is the exception we get when calling from the Hive UDF; from a
> separate Java program it works fine:
>
> javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
>     at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
>     at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1884)
>     at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:276)
>     at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:270)
>     at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1341)
>     at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:153)
>     at sun.security.ssl.Handshaker.processLoop(Handshaker.java:868)
>     at sun.security.ssl.Handshaker.process_record(Handshaker.java:804)
>     at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1016)
>     at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1312)
>     at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1339)
>     at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1323)
>     at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:563)
>     at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
>     at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1300)
>     at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:254)
>     at com.network.logs.udf.ProfoundNew.evaluate(ProfoundNew.java:30)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:1219)
>     at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.evaluate(GenericUDFBridge.java:182)
>     at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
>     at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>     at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>     at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:79)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
>     at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
>     at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
>     at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.j
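To pin down the environment difference discussed above, a small standalone check can be run once from the shell and once from inside the Hive task, and the outputs compared. This is a sketch I wrote for this thread (the class name and output format are my own, not from the original mails); it prints the SSL-relevant settings and the number of CAs the JVM's default trust manager accepts:

```java
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

// Prints the SSL-relevant environment so the standalone run and the Hive
// task run can be compared side by side (run it once from each).
public class SslEnvDump {

    // Returns the number of CA certificates the default trust manager accepts.
    public static int defaultCaCount() throws Exception {
        TrustManagerFactory tmf =
            TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init((java.security.KeyStore) null); // null => use the JVM default truststore
        for (javax.net.ssl.TrustManager tm : tmf.getTrustManagers()) {
            if (tm instanceof X509TrustManager) {
                return ((X509TrustManager) tm).getAcceptedIssuers().length;
            }
        }
        return 0;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("java.home                = " + System.getProperty("java.home"));
        System.out.println("javax.net.ssl.trustStore = "
            + System.getProperty("javax.net.ssl.trustStore",
                                 "(default: $JAVA_HOME/lib/security/cacerts)"));
        System.out.println("accepted CA certificates = " + defaultCaCount());
    }
}
```

If the two runs report different java.home paths or a CA count of zero in the task run, that would support the environment-difference theory.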
Re: Hive UDF accessing https request
Hmm, I’ve no idea off the top of my head what this exception means. My guess is that something is different about the environment in which the Hive task runs vs. the separate program. Different machine, different user, different Java args, path, not sure. It probably cannot find some Java SSL thing, e.g. a truststore, from the Hive task, or doesn’t have access to it.

From: Prabhu Joseph <prabhujose.ga...@gmail.com>
Reply-To: "u...@hive.apache.org"
Date: Sunday, January 10, 2016 at 22:06
To: "u...@hive.apache.org"
Cc: "dev@hive.apache.org"
Subject: Re: Hive UDF accessing https request

Thanks Sergey for looking into this.

Below is the exception we get when calling from the Hive UDF; from a separate Java program it works fine:

javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
    at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1884)
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:276)
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:270)
    at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1341)
    at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:153)
    at sun.security.ssl.Handshaker.processLoop(Handshaker.java:868)
    at sun.security.ssl.Handshaker.process_record(Handshaker.java:804)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1016)
    at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1312)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1339)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1323)
    at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:563)
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1300)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:254)
    at com.network.logs.udf.ProfoundNew.evaluate(ProfoundNew.java:30)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:1219)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.evaluate(GenericUDFBridge.java:182)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:79)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1469)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:385)
    at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:292)
    at sun.security.validator.Validator.validate(Validator.java:260)
    at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:326)
    at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:231)
    at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:126)
    at sun.security.ssl.ClientHandsh
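One way to test Sergey's truststore theory is to point the JVM explicitly at a truststore known to contain the server's CA and retry the request. The sketch below is mine, not from the thread; the path and password are assumptions (substitute whatever truststore actually exists on the task nodes):

```java
// Hypothetical check: set the standard JSSE truststore system properties
// before any HTTPS connection is opened, then retry the request.
public class TrustStoreOverride {

    // Sets the JVM-wide truststore location and password.
    public static void configure(String path, String password) {
        System.setProperty("javax.net.ssl.trustStore", path);
        System.setProperty("javax.net.ssl.trustStorePassword", password);
    }

    public static void main(String[] args) {
        // Path is an assumption -- "changeit" is the conventional default password.
        configure("/etc/pki/java/cacerts", "changeit");
        System.out.println("trustStore = " + System.getProperty("javax.net.ssl.trustStore"));
    }
}
```

For a MapReduce task the equivalent would be passing `-Djavax.net.ssl.trustStore=...` to the task JVMs via `mapreduce.map.java.opts`, since setting the property inside the UDF may come too late if an HTTPS connection was already made in that JVM.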
Re: Hive UDF accessing https request
> javax.net.ssl.SSLHandshakeException:
> sun.security.validator.ValidatorException: PKIX path building failed:
> sun.security.provider.certpath.SunCertPathBuilderException: unable to
> find valid certification path to requested

There's a Linux package named ca-certificates(-java) which might be missing. You can check what's in /etc/ssl/certs/java/ and make sure the certificates are there.

Running external-request workloads from a UDF is not recommended, because on every task failure all previous results are discarded: each retry starts again from the first item, which is in general wasteful and slow.

Cheers,
Gopal
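Following up on Gopal's ca-certificates(-java) point, a quick way to verify a candidate system truststore is to open it as a Java KeyStore and count its entries. This sketch is my own addition; the paths in the comments are assumptions (Debian/Ubuntu's ca-certificates-java installs /etc/ssl/certs/java/cacerts, while RHEL/CentOS ships /etc/pki/java/cacerts):

```java
import java.io.FileInputStream;
import java.security.KeyStore;

// Counts the entries in a candidate truststore file, or returns -1 if the
// file is missing or cannot be read as a keystore.
public class TrustStoreCount {

    public static int count(String path, char[] password) {
        try (FileInputStream in = new FileInputStream(path)) {
            KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
            ks.load(in, password); // "changeit" is the conventional default password
            return ks.size();
        } catch (java.io.IOException | java.security.GeneralSecurityException e) {
            return -1; // absent or unreadable
        }
    }

    public static void main(String[] args) {
        System.out.println("/etc/ssl/certs/java/cacerts: "
            + count("/etc/ssl/certs/java/cacerts", "changeit".toCharArray()));
    }
}
```

A count of -1 on the task nodes would confirm the missing-package theory; a healthy CA bundle typically holds on the order of a hundred entries.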
Re: Hive UDF accessing https request
> To: "u...@hive.apache.org", "dev@hive.apache.org"
> Subject: Hive UDF accessing https request
>
> Hi Experts,
>
> I am trying to write a Hive UDF which accesses an https endpoint and, based
> on the response, returns the result. From plain Java the https response
> comes back, but when accessed from the UDF the result is null.
>
> Can anyone review the below and share the correct steps to do this?
>
> create temporary function profoundIP as 'com.network.logs.udf.ProfoundIp';
>
> select ip, profoundIP(ip) as info from r_distinct_ips_temp;
> // returns NULL
>
> // Below is the UDF program
>
> package com.network.logs.udf;
>
> import java.io.BufferedReader;
> import java.io.InputStreamReader;
> import java.net.URL;
>
> import javax.net.ssl.HttpsURLConnection;
>
> import org.apache.hadoop.hive.ql.exec.UDF;
> import org.apache.hadoop.io.Text;
>
> public class ProfoundNew extends UDF {
>
>     public Text evaluate(Text input) {
>         String url = "https://api2.profound.net/ip/" + input.toString() + "?view=enterprise";
>         URL obj;
>         try {
>             obj = new URL(url);
>             HttpsURLConnection con = (HttpsURLConnection) obj.openConnection();
>             con.setRequestMethod("GET");
>             con.setRequestProperty("Authorization", "ProfoundAuth apikey=cisco-065ccfec619011e38f");
>             int responseCode = con.getResponseCode();
>             BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
>             String inputLine;
>             StringBuffer response = new StringBuffer();
>             while ((inputLine = in.readLine()) != null) {
>                 response.append(inputLine);
>             }
>             in.close();
>             return new Text(response.toString());
>         } catch (Exception e) {
>             e.printStackTrace();
>         }
>         return null;
>     }
> }
>
> Thanks,
> Prabhu Joseph
Re: Hive UDF accessing https request
To start with, you can remove the try-catch so that the exception is not swallowed and you can see whether an error occurs. Note, however, that issuing an external request per row is an anti-pattern for any reasonably sized dataset.

From: Prabhu Joseph <prabhujose.ga...@gmail.com>
Reply-To: "u...@hive.apache.org"
Date: Friday, January 8, 2016 at 00:51
To: "u...@hive.apache.org", "dev@hive.apache.org"
Subject: Hive UDF accessing https request

Hi Experts,

I am trying to write a Hive UDF which accesses an https endpoint and, based on the response, returns the result. From plain Java the https response comes back, but when accessed from the UDF the result is null.

Can anyone review the below and share the correct steps to do this?

create temporary function profoundIP as 'com.network.logs.udf.ProfoundIp';

select ip, profoundIP(ip) as info from r_distinct_ips_temp;
// returns NULL

// Below is the UDF program

package com.network.logs.udf;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

import javax.net.ssl.HttpsURLConnection;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class ProfoundNew extends UDF {

    public Text evaluate(Text input) {
        String url = "https://api2.profound.net/ip/" + input.toString() + "?view=enterprise";
        URL obj;
        try {
            obj = new URL(url);
            HttpsURLConnection con = (HttpsURLConnection) obj.openConnection();
            con.setRequestMethod("GET");
            con.setRequestProperty("Authorization", "ProfoundAuth apikey=cisco-065ccfec619011e38f");
            int responseCode = con.getResponseCode();
            BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
            String inputLine;
            StringBuffer response = new StringBuffer();
            while ((inputLine = in.readLine()) != null) {
                response.append(inputLine);
            }
            in.close();
            return new Text(response.toString());
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }
}

Thanks,
Prabhu Joseph
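Following Sergey's advice to remove the try-catch, one way the fetch logic could look is sketched below. This is my own rework, not code from the thread: the Hive-specific wrapper is omitted so the class is self-contained (in the real UDF, `evaluate(Text)` would just `return new Text(ProfoundFetch.fetch(input.toString()))` and let the exception propagate into the task log instead of silently returning NULL):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import javax.net.ssl.HttpsURLConnection;

// Fetch logic with the try-catch removed: an SSLHandshakeException now
// surfaces to the caller instead of being swallowed into a NULL result.
public class ProfoundFetch {

    // Pure helper: builds the request URL for an IP (URL taken from the post).
    public static String buildUrl(String ip) {
        return "https://api2.profound.net/ip/" + ip + "?view=enterprise";
    }

    // Performs the request; any SSL or I/O failure propagates to the caller.
    public static String fetch(String ip) throws IOException {
        HttpsURLConnection con =
            (HttpsURLConnection) new URL(buildUrl(ip)).openConnection();
        con.setRequestMethod("GET");
        con.setRequestProperty("Authorization",
            "ProfoundAuth apikey=cisco-065ccfec619011e38f");
        StringBuilder response = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(con.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                response.append(line);
            }
        }
        return response.toString();
    }
}
```

The try-with-resources still closes the reader on failure; only the exception-swallowing catch block is gone.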
Hive UDF accessing https request
Hi Experts,

I am trying to write a Hive UDF which accesses an https endpoint and, based on the response, returns the result. From plain Java the https response comes back, but when accessed from the UDF the result is null.

Can anyone review the below and share the correct steps to do this?

create temporary function profoundIP as 'com.network.logs.udf.ProfoundIp';

select ip, profoundIP(ip) as info from r_distinct_ips_temp;
// returns NULL

// Below is the UDF program

package com.network.logs.udf;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

import javax.net.ssl.HttpsURLConnection;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class ProfoundNew extends UDF {

    public Text evaluate(Text input) {
        String url = "https://api2.profound.net/ip/" + input.toString() + "?view=enterprise";
        URL obj;
        try {
            obj = new URL(url);
            HttpsURLConnection con = (HttpsURLConnection) obj.openConnection();
            con.setRequestMethod("GET");
            con.setRequestProperty("Authorization", "ProfoundAuth apikey=cisco-065ccfec619011e38f");
            int responseCode = con.getResponseCode();
            BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
            String inputLine;
            StringBuffer response = new StringBuffer();
            while ((inputLine = in.readLine()) != null) {
                response.append(inputLine);
            }
            in.close();
            return new Text(response.toString());
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }
}

Thanks,
Prabhu Joseph