> of the PMC.
Another volunteer here :)
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
ilt-in? If so, it
might be useful to have a button that uses the built-in service where
Android makes it available.
My main wish for Android would be support for HFST's dictionaries; I
read
http://lists.puimula.org/pipermail/libvoikko/2012-August/000442.html as
HFST being used for spellers on And
rom now on
@D.NUMxrel.pres@ means: only allow this analysis if the 'NUMxrel'
feature is 'pres'
These features are called "flag diacritics" for no good reason. They can
be useful for expressing long-distance relationships.
I don't know what @D.lim@ does,
rcada Mikel L. Forcada
> 15 nordfalk Jacob Nordfalk
> 16 sanmarf Felipe Sánchez Martínez
> 17 selimcan Ilnar Salimzyan
> 18 sortiz Sergio Ortiz
> 19 spectre360 Francis Tyers
> 20 tunedal Per Tunedal
> 21 unhammer Kevin Brubeck Unhammer
> 22 xavivars Xavi I
regular lttoolbox and run
$ lt-proc -c -t /home/karunakar/scl/skt_gen/bin/skt_gen.bin < input
?
Do you have a UTF-8 locale installed and set? What does
$ export|grep LANG
give ?
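If the output is hard to interpret, the check can be sketched as a small shell function. This is only an illustration: is_utf8_locale is a made-up helper name, not something that ships with lttoolbox, and it only looks at the spelling of the locale name rather than asking the system whether that locale is installed.

```shell
# Sketch: succeed if a locale string (e.g. the value of $LANG) names a UTF-8 locale.
# is_utf8_locale is a hypothetical helper, not part of lttoolbox or apertium.
is_utf8_locale() {
    case "$1" in
        *.UTF-8*|*.utf8*|*.UTF8*|*.utf-8*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: LC_ALL overrides LC_CTYPE, which overrides LANG.
# is_utf8_locale "${LC_ALL:-${LC_CTYPE:-$LANG}}" && echo "UTF-8 locale is set"
```

Whether the named locale is actually installed is a separate question; "locale -a" lists the installed ones.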
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
-
el has pointed out on IRC, the decision is the
> election board's.
It didn't occur to me that we have committers who don't follow
apertium-stuff, guess I was wrong :) Jernej agreed to keeping the census
open a bit more, I do too – Xavi?
Should we say until the 16th?
So s
Francis Tyers writes:
> On Tue, 09 Apr 2013 at 22:20 +0200, Kevin Brubeck Unhammer
> wrote:
>> So sf.net considers it spam to send messages to all committers, but is
>> there any other place than IRC we should be spamming^Wmessaging?
>
> If you prepare
o
./configure --enable-utf
before compiling, does that help?
[...]
> For the other task I still have to learn a bit more about how the
> TSX files are structured, and how the rules (grammar constraints) are
> built
http:/
Final call to add yourselves to
http://wiki.apertium.org/wiki/PMC_election#Candidates !
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
da, which is processed by an XSLT script into two
different bidixes before compilation. Most entries would be the same,
but some would be marked nn-only or nb-only. This kind of thing happens
in a lot of apertium pairs, and should be no trouble to set up.
--
Kevin Brubeck Unhammer
Written wi
't work before adding new stuff (actually, I would first run a big
corpus through the CG with --trace and delete any unused rules, to make
it easier to deal with). But if you work on no→da first, the da CG would
not be useful yet.
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
-
+1, if it's possible to give several people the responsibility of
approving so that registrations happen quickly even if people go off the
grid (I've been waiting for confirmation for two years on the
thinkpadwiki :-/).
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
---
"Bernard Chardonneau"
writes:
>> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)
>> Date: Thu, 06 Jun 2013 15:38:27 +0200
>> From: Kevin Brubeck Unhammer
>> To: apertium-stuff@lists.sourceforge.net
>> Reply-To: apertium-stuff@lists.sourcef
> in its tracks on message boards I have run in the past (dramatic difference
> between dozens of posts to clean up daily versus one or two weekly).
stopforumspam.com seems to be down, but
https://www.mediawiki.org/wiki/Manual:Combating_spam#IP_address_blacklists
User:David_Nemeskey/CG_XML_brainstorming
?
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
are XML id's; I wish they weren't, since for some reason XML id's are
terribly limited in what characters they can contain, e.g. neither @, →,
←, $, nor &entity; are allowed (CG tags need to be able to have @ or →
in them).
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
--
ion would be to do what is intuitive: to allow as a
> parameter to macros. The correct++ resolution would be to allow
> anything that can appear in to be used as a parameter to a
> macro.
Is there a good reason why we can't pass anything to a macro?
If people try to pass
Kevin Brubeck Unhammer writes:
> "Jimmy O'Regan"
> writes:
>
>> [Readding list cc]
>>
>> On 13 July 2013 07:12, Mikel Forcada
>> wrote:
>>> Sergio, Jimmy, all:
>>>
>>> Thanks for your help. I am, however, still a b
in nb-nn to turn it into "vere" since
"befinne seg" doesn't really work in nn.
(The pardef needs more work if you ever want to keep the reflexive
pronoun, as the TODO comment tries to say.)
--
Kevin Brubeck Unhammer
Francis Tyers writes:
> Hello Apertiumers!
>
> I'd like to canvas opinions on creating a new top-level SVN module for
> monolingual language packs.
+1
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
it's time to
> move on to more meaningful texts. That's why I believe it would be a
> help in developing new language pairs.
It's hardly natural language though. You'll perhaps get a good
translator of block world texts, but how often d
requent outside the block world corpus? How did
you come by that information? If you've already got a real frequency
list, it's a waste of time to make a new one from text that you know is
not natural.
The same argument goes for grammatical constructions.
--
Kevin Brubeck Unhammer
h
Francis Tyers writes:
> Hello all,
>
> We just got a project funded for Finnish-Estonian and Finnish-North
> Sámi!
>
> Go Apertium! :D
Congrats =D
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
in order to make a tarball. This can be uploaded to sourceforge, but
perhaps you want to increase the version number first? See
http://wiki.apertium.org/wiki/Making_a_release
--
Kevin Brubeck Unhammer
GPG: 0x766AC60
Tihomir Rangelov
writes:
> On 6.9.2013, at 08:30, Kevin Brubeck Unhammer wrote:
>> apertium.org runs only released tarballs. I made some minor changes to
>> the makefile so you should be able to type "./autogen.sh && make dist"
>> in order to make a tarba
eason that it doesn't work as is.
>
> Fran
One confusing thing is that when null_flush is true, transfer calls
transfer_wrapper_null_flush which then calls transfer (now with
null_flush false, but internal_null_flush true).
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
Tihomir Rangelov
writes:
> Many thanks!
> When does it appear on the web service?
Someone with ssh access needs to install it; I don't know who currently
has access – anyone?
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
Kevin Brubeck Unhammer writes:
> Francis Tyers writes:
>
>> Hello all,
>>
>> Can anyone confirm that null flush (-z) option in transfer works with
>> the -b (no bidix) option ?
[...]
Mistook you to mean "lt-proc -b", but apertium-transfer -z -b also w
;
morpho_stream.setEndOfFile(false);
}
I'm not 100 % sure if "alpha[0][eos] = 1;" is needed. It seems to give
the same result without it too, but it is set when starting the tagging
and doesn't hurt so I included it.
Felipe (or anyone else who understands the HMM): d
output | dwdiff --color --diff-input | less
$ diff -U1 old-da-sv-output new-da-sv-output | dwdiff --color --diff-input | less
[1] Unless you have an army of competent human translators happy to
post-edit lots of texts for you every time you make a change just so
you can run WER.
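Before paging through a word diff, a rough count of how many output lines changed can also be handy. A minimal sketch using only plain diff and grep; count_changed_lines is a hypothetical helper name, and the file names in the usage line are just the ones from the example above.

```shell
# Sketch: count the lines of the new output that differ from the old output.
# count_changed_lines is a hypothetical helper; it needs only diff and grep.
count_changed_lines() {
    # In diff's default format, lines present only in the second file start with ">".
    # grep -c prints the count but exits non-zero on zero matches, hence "|| true".
    diff "$1" "$2" | grep -c '^>' || true
}

# Usage:
# count_changed_lines old-da-sv-output new-da-sv-output
```

This only says how much changed, not whether the change is an improvement; for that you still have to read the diff.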
--
Kev
search?searchon=names&keywords=dwdiff
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
ur Debian has a different version of it than my Ubuntu (dwdiff
2.0.4). Anyway, you can do
$ dwdiff -C1 --color old-sv-da-output new-sv-da-output | less
to get the same effect (it's just a bit slower with huge files in my
experience).
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
d, not the "diff -U1 | dwdiff" version. Or, with your blockworld
files, you run
$ dwdiff -C1 --color blockworld.evaluation.da blockworld.evaluation.sv.translated | less
If there is no difference, it should not show any output (this happened
when I tried it on the very little Danish text I
>
> Wikipedia very rarely has good translations between languages.
Fortunately, training the tagger doesn't require parallel text, just
monolingual text.
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
contribute (sometime in the summer 2012?) with the most recent version.
Just "svn up -rN" where N is your first commit. Do
$ svn log > thelog
to download the commit log to find your first commit.
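Searching the log by hand works, but the lookup can also be sketched with awk. This assumes the usual "svn log" header format (rNNN | username | date | N lines); first_commit_rev is a made-up helper name and "yourusername" in the usage line is a placeholder.

```shell
# Sketch: print the lowest revision number committed by a given user,
# reading `svn log` output on stdin. first_commit_rev is a hypothetical helper.
first_commit_rev() {
    awk -v user="$1" -F' [|] ' '
        # Header lines look like: r1234 | username | date | N lines
        $1 ~ /^r[0-9]+$/ && $2 == user {
            rev = substr($1, 2) + 0
            if (min == "" || rev < min) min = rev
        }
        END { if (min != "") print min }
    '
}

# Usage:
# first_commit_rev yourusername < thelog
```

The number it prints is the N to give to "svn up -rN".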
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
ux terminal to read, they should show up as green and red.
Maybe your "less" command is deficient and does not show colour? You
could try without "|less" if your terminal lets you scroll anyway.
--
Kevin Brubeck Un
've just
replaced .prob files without changing their file names, then no, you
don't need new modes files.)
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
^blå/blå/blå$
> ^kon/ko/kon/ko/ko/kon/kon$
> ^./.$
If you want to test this properly, use "lt-proc -e", since your
dictionary has compounding tags on it. The tags /
should never appear in output.
--
^ta$
lt-proc with bidix gives:
^ta/take/get/grab$
lrx-proc (lexical selection) gives:
^grab$
and that is passed on to transfer rules (which ensure you write "grab
the cone" and not just "grab cone", but they are not responsible fo
hological
disambiguation includes choosing noun vs verb (English
^fly/fly/fly/fly$), choosing different lemmas of nouns
(Swedish ^kon/ko/kon$), or even different analyses of the same
lemma+PoS (English ^sheep/sheep/sheep$).
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
> Or am I wrong? Does the tagger work more like a language model i.e. take
> the actual words into account?
You can put lemmas in the TSX if you want (example:
http://wiki.apertium.org/wiki/Tagger_training#Writing_a_TSX_file ), but
if I understand correctly it won't take forms/lemmas
kistán$}$^punt{^.$}$
$ echo between Kazakhstan and outer space | apertium -d . en-es-chunker
^Pr{^entre$}$ ^nom{^Kazajistán<3><4>$}$
^cnj{^y$}$ ^nom_adj{^espacio<3><4>$
^exterior<4>$}$^punt{^.$}$
What's the use case?
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
[1] https://www.flameeyes.eu/autotools-mythbuster/pkgconfig/index.html
[2] https://www.flameeyes.eu/autotools-mythbuster/pkgconfig/pkg_check_modules.html
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
Francis Tyers writes:
> On Tue, 15 Oct 2013 at 11:47 +0200, Kevin Brubeck Unhammer
> wrote:
[...]
>> If we rename the .pc files, all language pairs need one change to
>> configure.ac, but hopefully this will be the last one they need :-)
>
> I think t
context b9:b >: i: b9:b _ X3: j:i
WARNING! The conflict is unresolvable.
There is a <=-rule conflict between "e7 Rising" and "Illative Singular
Metaphony2 SUBCASE: Vx=e7 Vy=á".
E.g. in context b9:b i: b9:b _ W9: W6: j:i
usr/local/bin:${PATH}
in your ~/.bashrc (or ~/.profile or ~/.bash_profile or whichever file
you use), opening a new terminal, and trying again. This should make
/usr/bin preferred over /usr/local/bin.
If that doesn't work, can you try
$ /usr/local/bin/pe
use5005threads=undef useithreads=undef
useithreads=undef means /usr/local/bin/perl can't be used.
[...]
> root@host [~]# /usr/bin/perl -V // gives the same as before
Exactly the same? Most importantly, did /usr/bin/perl -V also say
"useithreads=undef"?
(The perl on my
bit more zoomed
out, there a verb preposition tag might make us change the preposition
chunk.
(If I were to do it again, I would probably wait with inserting the
preposition chunk until t2x. That is, t1x would add e.g. to the
noun chunk and to the verb chunk, while t2x would decide which
one to us
you
would have to check with them.
If none of that works, http://stackoverflow.com/a/19063960/69663
suggests editing /usr/share/automake-1.11/Automake/Config.pm to say
our $perl_threads = 0;
instead of "our $perl_threads = 1;" – it smells like a hack, but it may
work (if you have root).
> should really either update that installer, or remove it.
Seems like it needs an update, yes:
http://www.google-melange.com/gci/task/view/google/gci2013/6396457749839872
http://superuser.com/a/628401
Does the source for that installer exi
"Jimmy O'Regan"
writes:
> On 21 November 2013 15:05, Kevin Brubeck Unhammer
> wrote:
>> "Jimmy O'Regan"
>> writes:
>>
>>> I'm not 100% about this, but there was a problem with Cygwin recently
>>> - IIRC, certain
demo app" for other app
developers to build on; but I think the plan was to have a separate,
more fancy app, so your ideas could certainly go into a new app based on
the current one. But then I think we need some mentors who know the
android stuff to hang out on #apertium (hint, hint).
[
> one.
+1
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
one, run it through OCR
and Apertium on the server and send back the translation in Kazakh. And
the web page would run not only on iOS.
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
ll get to try it out.
If the powers that be accept the code, I can merge this into lttoolbox
(without all the #ifdef DEBUG statements =P).
[1]
http://wiki.apertium.org/wiki/Talk:Automatically_trimming_a_monodix#.23-type_multiwords
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
Kevin Brubeck Unhammer writes:
> There is one caveat: (group element) is not handled yet. The man
> page notes how to work around that. I have an idea for handling the
> group element[1], but I'm not sure when I'll get to try it out.
With the latest version on
https://
Kevin Brubeck Unhammer writes:
> Regarding multiwords with both and at once, I'm not even sure
> what those should look like. But when I tested with ca-en they seem to
> be included, except these inf+es entries which keep getting trimmed
> out:
>
> canvi
it with a warning that it's still considered
experimental).
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
typically used for marking language variants. E.g. in an English dix,
you might have alt="en_GB" on the entry for "colour", and alt="en_US" on
the entry for "color". Then the alt.xsl script called with en_GB as the
option would include only those entries that h
ed fine
with the html-noent formatter. The guy wanted to translate Android core
resources into Nynorsk; however, he never found out how to contribute
his translations to the Android project (and neither did I) so that died
down pretty quickly.
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
Gaurav Agrawal
writes:
> Hello Kevin,
>
> Thanks for answering my queries, I have some further doubts.
>
> Thanks and Regards,
> Gaurav Agrawal
> M.Tech CSE
> IIIT, Hyderabad
>
> IRC #ergaurav2
>
> On Tue, Mar 11, 2014 at 1:36 PM, Kevin Brubeck Unhammer
>
"Jim O'Regan" writes:
> On 11 March 2014 08:12, Kevin Brubeck Unhammer
> wrote:
>> Mikel Artetxe writes:
>>
>> [...]
>>
>>> 3) The real work to do would be writing a deformatter/reformatter for
>>> string resources, which use
mputer science at Pennsylvania State
> University and last semester I had an NLP class. I'm familiar with the
> input normalization task. I speak English, Arabic and French very
> fluently.
>
> Is there any specific language that I should target?
All of them :-) but perhaps s
-> birthday. So we
> have to handle them separately. And to do this first we have to
> classify them.
>
> For translation I have just written a bash script
> while read line; do echo $line | apertium en-eo; done < Tweets
That would run much faster as
apertium en-eo <
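The reason is process startup: the loop starts a fresh pipeline for every line, while a single invocation reads the whole file in one go. A rough sketch of the two shapes, using a stand-in command so it runs without Apertium installed; translate, per_line and batch are all hypothetical names, and translate merely upper-cases its input.

```shell
# Sketch with a stand-in: "translate" is a hypothetical placeholder for a
# pipeline such as `apertium en-eo`; here it only upper-cases its input.
translate() { tr 'a-z' 'A-Z'; }

# Slow shape: one process start per input line.
per_line() {
    while IFS= read -r line; do
        printf '%s\n' "$line" | translate
    done
}

# Fast shape: one process handles the whole stream.
batch() { translate; }

# Usage:
# per_line < Tweets
# batch < Tweets
```

With the stand-in both shapes give identical output; with a real translator that may depend on whether any stage looks at context across line breaks.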
od idea,
you should certainly do it, but it's not really a deliverable.
[1] http://wiki.apertium.org/wiki/Easy_dictionary_maintenance
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
Francis Tyers writes:
> Hello everyone,
>
> I think it is time to do a new release of apertium and lttoolbox.
+1
--
Kevin Brubeck Unhammer
Sent from my emacs
hunk.dtd tagger.dtd
transfer.dtd
- acx.rng modes.rng transfer.rng
(I would prefer to rm {dix,acx}.* from apertium, and otherwise either
have one format for all files, or both formats for all files.)
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
"Jim O'Regan" writes:
> On 25 March 2014 13:14, Kevin Brubeck Unhammer wrote:
[...]
>> While on the subject of things that could be sorted out before a
>> release, why are some XML validation files .rng and some .dtd? And why
>> does apertium
"Jim O'Regan" writes:
> On 25 March 2014 22:15, Kevin Brubeck Unhammer wrote:
>> "Jim O'Regan" writes:
[...]
>> So it should be fine to just change
>>
>> @echo "$(XMLLINT) --dtdvalid $(apertiumdir)/dix.dtd --noout
>
Kevin Brubeck Unhammer writes:
> "Jim O'Regan" writes:
>
>> On 25 March 2014 22:15, Kevin Brubeck Unhammer
>> wrote:
>>> "Jim O'Regan" writes:
>
> [...]
>
>>> So it should be fine to just change
>>>
>&g
ekend:
apertium-kaz-tat 0.2.1
apertium-sme-nob-0.5.1
apertium-ca-it 0.1.1
apertium-eu-es-0.3.3
Apart from the mentioned syntax errors, I haven't touched dictionaries,
only Makefile.am and configure.ac, so they're as testvoqued as they were
the previous release.
--
Kevin Brubeck Unhammer
script to do
0El1 …
instead of
[0]El[1]
and call apertium with the "-f html-noent" format option instead of "-f
none" (since the latter will break as soon as you get any "special
symbol" as input).
--
Kevin Bru
e'
>
> with the links reordered.
Ah, then numbered superblanks should _not_ work; we try not to reorder
superblanks since it could mess up HTML.
(Tino Didriksen suggested a solution involving treating some blanks as
glued to the words and some as non-reorderable, but no one's tried
i
the corpus from Wikipedia dumps. Hence, we're looking
> for help with this
>
> For more details, check out this link:
> http://piratepad.net/LanguagesTested or contact me/sushain. Thanks!
Do you just need a plain text corpus in various languages?
--
Kevin Brubeck Unhammer
GPG
I just committed a little bugfix to trunk/apertium (reported by Vee /
v21 on IRC, thanks!). On non-ASCII input, the reformatters would eat
memory until they dropped. If you run apertium on large corpora and have
memory issues you might want to update :-)
--
Kevin Brubeck Unhammer
GPG
://wiki.apertium.org/wiki/Reordering_superblanks
The sketched solution seems to me like it should deal with all of the
above issues. Comments please :-)
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
C to discuss it. If upper management
agrees, we can probably get you commit access.
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
use lt-proc for non-Apertium things. But it would require changing
modes files for any pairs that want to take advantage of it … I think
maybe a hardcoded ignore-list in lttoolbox would be more helpful to more
users. Are there other use-cases than soft-hyphens? Or cases where we
want to _not_ ignore the s
from ScaleMT that might affect things is that APY uses
three-letter language codes (ISO-639-3); I have no idea how to debug it
in the mediawiki thing though.
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C
rtium/files/lttoolbox/
and https://sourceforge.net/projects/apertium/files/apertium/
--
Kevin Brubeck Unhammer
GPG: 0x766AC60C