[
https://issues.apache.org/jira/browse/OPENNLP-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15660268#comment-15660268
]
Tristan Nixon commented on OPENNLP-857:
---
I'm not seeing this in the trunk code, where should I look
[
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15644259#comment-15644259
]
Tristan Nixon commented on OPENNLP-776:
---
I've been swamped with other work, but I should be able
[
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15545918#comment-15545918
]
Tristan Nixon edited comment on OPENNLP-776 at 10/4/16 4:44 PM:
Sorry, I
[
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15545918#comment-15545918
]
Tristan Nixon commented on OPENNLP-776:
---
Sorry, I probably should have removed that older patch
[
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15545791#comment-15545791
]
Tristan Nixon commented on OPENNLP-776:
---
Well, it's a bit of a messy type hierarchy, since
[
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15428454#comment-15428454
]
Tristan Nixon commented on OPENNLP-776:
---
Good point. I thought the only way to provide custom
[
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tristan Nixon updated OPENNLP-776:
--
Attachment: serialization_proxy.patch
Patch containing modifications to model classes
[
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15412332#comment-15412332
]
Tristan Nixon commented on OPENNLP-776:
---
This pattern is quite common in frameworks that manage
[
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tristan Nixon updated OPENNLP-776:
--
Attachment: (was: externalizable.patch)
> Model Objects should be Serializa
[
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tristan Nixon updated OPENNLP-776:
--
Attachment: externalizable.patch
Also model classes can't be final if we're going to inherit
[
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tristan Nixon updated OPENNLP-776:
--
Attachment: externalizable.patch
Actually, there is one more thing that must happen
[
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tristan Nixon updated OPENNLP-776:
--
Attachment: (was: BaseModel-serialization.patch)
> Model Objects should be Serializa
[
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383308#comment-15383308
]
Tristan Nixon commented on OPENNLP-776:
---
Finally returning to this after more than a year. I'm
[
https://issues.apache.org/jira/browse/OPENNLP-857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tristan Nixon updated OPENNLP-857:
--
Attachment: ParserToolTokenize.patch
My patch
> ParserTool should take use Tokenizer insta
Tristan Nixon created OPENNLP-857:
-
Summary: ParserTool should use a Tokenizer instance. It should
not use java.util.StringTokenizer
Key: OPENNLP-857
URL: https://issues.apache.org/jira/browse/OPENNLP-857
o's are available? (Not sure if non-local files, such as HDFS, are
> supported)
>
> On Mon, Mar 14, 2016 at 2:12 PM, Tristan Nixon <st...@memeticlabs.org
> <mailto:st...@memeticlabs.org>> wrote:
> > What build system are you using to compile your code?
> > If you u
What build system are you using to compile your code?
If you use a dependency management system like maven or sbt, then you should be
able to instruct it to build a single jar that contains all the other
dependencies, including third-party jars and .so’s. I am a maven user myself,
and I use the
in, but with its
troublesome scalap dependency removed.
> On Mar 11, 2016, at 6:34 PM, Vasu Parameswaran <vas...@gmail.com> wrote:
>
> Added these to the pom and still the same error :-(. I will look into sbt as
> well.
>
>
>
> On Fri, Mar 11, 2016
So I think in your case you’d do something more like:
val jsontrans = new JsonSerializationTransformer[StructType]
  .setInputCol("event")
  .setOutputCol("eventJSON")
> On Mar 11, 2016, at 3:51 PM, Tristan Nixon <st...@memeticlabs.org> wrote:
>
> val jsontrans = new
>
I recommend you package all your dependencies (jars, .so’s, etc.) into a single
uber-jar and then submit that. It’s much more convenient than trying to manage
including everything in the --jars arg of spark-submit. If you build with maven
then the shade plugin will do this for you nicely:
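For reference, a minimal form of that shade-plugin configuration (binding to
the `package` phase is the plugin’s standard usage; the version tag is omitted
here and any relocations/filters are left out as assumptions):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <!-- Build the uber-jar when `mvn package` runs -->
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

With this in place, `mvn package` produces a single jar containing your classes
plus all compile-scope dependencies, which you can hand directly to spark-submit.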
You must be relying on IntelliJ to compile your scala, because you haven’t set
up any scala plugin to compile it from maven.
You should have something like this in your plugins:
<plugin>
  <groupId>net.alchim31.maven</groupId>
  <artifactId>scala-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>scala-compile-first</id>
      <phase>process-resources</phase>
      <goals>
        <goal>compile</goal>
      </goals>
    </execution>
  </executions>
</plugin>
into a JSON-formatted string.
* Created by Tristan Nixon <tris...@memeticlabs.org> on 3/11/16.
*/
class JsonSerializationTransformer[T](override val uid: String)
extends UnaryTransformer[T,String,JsonSerializationTransformer[T]]
{
def this() = this(Identifiable.ran
ntId: string (nullable = true)
>
>
>
> I want to transform the Column event into String (formatted as JSON).
>
> I was trying to use udf but without success.
>
>
> On Fri, Mar 11, 2016 at 1:53 PM Tristan Nixon <st...@memeticlabs.org
> <mailto:st...@meme
Have you looked at DataFrame.write.json( path )?
https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.DataFrameWriter
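A minimal sketch of what that looks like in practice (assumes you already have
a DataFrame `df` in scope; the output path is purely illustrative):

```scala
// Each row is written out as one JSON object per line; nested StructFields
// become nested JSON objects. `df` and the path are assumptions.
df.write.json("/tmp/events-json")
```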
> On Mar 11, 2016, at 7:15 AM, Caires Vinicius wrote:
>
> I have one DataFrame with nested StructField and I want to convert to
Hear, hear. That’s why I’m here :)
> On Mar 10, 2016, at 7:32 PM, Chris Fregly wrote:
>
> Anyway, thanks for the good discussion, everyone! This is why we have these
> lists, right! :)
Very interested, Evan, thanks for the link. It has given me some food for
thought.
I’m also in the process of building a web application which leverages Spark on
the back-end for some heavy lifting. I would be curious about your thoughts on
my proposed architecture:
I was planning on running a
dn't know whether I'm running it as super user.
>
> I have java version 1.8.0_73 and SCALA version 2.11.7
>
> Sent from my iPhone
>
>> On 9 Mar 2016, at 21:58, Tristan Nixon <st...@memeticlabs.org> wrote:
>>
>> That’s very strange. I just un-set m
Hmmm… that should be right.
> On Mar 10, 2016, at 11:26 AM, Ashic Mahtab wrote:
>
> src/main/resources/log4j.properties
>
> Subject: Re: log4j pains
> From: st...@memeticlabs.org
> Date: Thu, 10 Mar 2016 11:08:46 -0600
> CC: user@spark.apache.org
> To: as...@live.com
>
> Where
Where in the jar is the log4j.properties file?
> On Mar 10, 2016, at 9:40 AM, Ashic Mahtab wrote:
>
> 1. Fat jar with logging dependencies included. log4j.properties in fat jar.
> Spark doesn't pick up the properties file, so uses its defaults.
It really shouldn’t; if anything, running as superuser should ALLOW you to bind
to ports 0, 1, etc.
It seems very strange that it should even be trying to bind to these ports -
maybe a JVM issue?
I wonder if the old Apple JVM implementations could have used some different
native libraries for
That’s very strange. I just un-set my SPARK_HOME env param, downloaded a fresh
1.6.0 tarball,
unzipped it to local dir (~/Downloads), and it ran just fine - the driver port
is some randomly generated large number.
So SPARK_HOME is definitely not needed to run this.
Aida, you are not running
ld launch the scripts defaults to a
> single machine(local host)
>
> Sent from my iPhone
>
>> On 9 Mar 2016, at 19:59, Tristan Nixon <st...@memeticlabs.org> wrote:
>>
>> Also, do you have the SPARK_HOME environment variable set in your shell, and
>> if so what
Also, do you have the SPARK_HOME environment variable set in your shell, and if
so what is it set to?
> On Mar 9, 2016, at 1:53 PM, Tristan Nixon <st...@memeticlabs.org> wrote:
>
> There should be a /conf sub-directory wherever you installed spark, which
> contains several c
Tristan, thanks for your message
>
> When I look at the spark-defaults.conf.template it shows a spark
> example(spark://master:7077) where the port is 7077
>
> When you say look to the conf scripts, how do you mean?
>
> Sent from my iPhone
>
>> On 9 Mar 2016, a
Yeah, according to the standalone documentation
http://spark.apache.org/docs/latest/spark-standalone.html
the default port should be 7077, which means that something must be overriding
this on your installation - look to the conf scripts!
> On Mar 9, 2016, at 1:26 PM, Tristan Nixon
Looks like it’s trying to bind on port 0, then 1.
Often the low-numbered ports are restricted to system processes and
“established” servers (web, ssh, etc.) and
so user programs are prevented from binding on them. The default should be to
run on a high-numbered port like 8080 or such.
What do
You can also package an alternative log4j config in your jar files
> On Mar 9, 2016, at 12:20 PM, Ashic Mahtab wrote:
>
> Found it.
>
> You can pass in the jvm parameter log4j.configuration. The following works:
>
> -Dlog4j.configuration=file:path/to/log4j.properties
>
> It
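For context, a minimal log4j.properties of the kind being pointed at with
-Dlog4j.configuration (the contents here are an illustrative assumption, not
taken from the thread; log4j 1.x syntax, as Spark used at the time):

```properties
# Route everything at WARN and above to the console
log4j.rootLogger=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c: %m%n
```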
ct().
>
> Sorry I am new to spark and I am just messing around with it.
>
> On Mar 8, 2016 10:23 PM, "Tristan Nixon" <st...@memeticlabs.org
> <mailto:st...@memeticlabs.org>> wrote:
> My understanding of the model is that you’re supposed to execute
>
this is a bit strange, because you’re trying to create an RDD inside of a
foreach function (the jsonElements). This executes on the workers, and so will
actually produce a different instance in each JVM on each worker, not one
single RDD referenced by the driver, which is what I think you’re
My understanding of the model is that you’re supposed to execute
SparkFiles.get(…) on each worker node, not on the driver.
Since you already know where the files are on the driver, if you want to load
these into an RDD with SparkContext.textFile, then this will distribute it out
to the
why the error is happening.
>
> On Mon, Mar 7, 2016 at 5:55 PM, Tristan Nixon <st...@memeticlabs.org
> <mailto:st...@memeticlabs.org>> wrote:
> I’m not sure I understand - if it was already distributed over the cluster in
> an RDD, why would you want to collect and then
il.com> wrote:
>
> Hi Tristan,
>
> This is not static, I actually collect it from an RDD to the driver.
>
> On Mon, Mar 7, 2016 at 5:42 PM, Tristan Nixon <st...@memeticlabs.org
> <mailto:st...@memeticlabs.org>> wrote:
> Hi Arash,
>
> is this static d
[
https://issues.apache.org/jira/browse/TIKA-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622982#comment-14622982
]
Tristan Nixon commented on TIKA-1362:
-
Storing the API key in the properties file
[
https://issues.apache.org/jira/browse/TIKA-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622999#comment-14622999
]
Tristan Nixon commented on TIKA-1362:
-
Great to hear, and thanks for the invite. I'm
[
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tristan Nixon updated OPENNLP-776:
--
Attachment: model-constructors.patch
I realized that for automatic de-serialization, all
[
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550561#comment-14550561
]
Tristan Nixon commented on OPENNLP-776:
---
You're totally welcome! Let me know when
[
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550604#comment-14550604
]
Tristan Nixon commented on OPENNLP-776:
---
It does not make the (de-)serialization
[
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tristan Nixon updated OPENNLP-776:
--
Attachment: BaseModel-serialization.patch
My patch
Model Objects should be Serializable
Tristan Nixon created OPENNLP-776:
-
Summary: Model Objects should be Serializable
Key: OPENNLP-776
URL: https://issues.apache.org/jira/browse/OPENNLP-776
Project: OpenNLP
Issue Type
[
https://issues.apache.org/jira/browse/SPARK-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517886#comment-14517886
]
Tristan Nixon commented on SPARK-4414:
--
Thanks, [~petedmarsh], I was having this same
Hello all,
I have a question regarding the way in which SA deals
with whitelisting/blacklisting. If I want to whitelist
all but a few select entries from a domain, how would I do it?
Should the following work?
whitelist_from [EMAIL PROTECTED]
unwhitelist_from [EMAIL PROTECTED]
unwhitelist_from