Re: SA bayes file db permission issue

2016-06-12 Thread RW
On Sun, 12 Jun 2016 00:04:49 -0500 (CDT)
Dave Funk wrote:

> On Sat, 11 Jun 2016, RW wrote:
> 
> > On Fri, 10 Jun 2016 15:38:44 -0400
> > Joseph Brennan wrote:
> >
> >  
> >> This is a nice test I found:
> >> echo -n I | od -to2 | awk '{ print substr($2,6,1); exit}'
> >>
> >> 1 little-endian
> >> 0 big-endian  
> >
> > I don't see how this can output anything other than 1.
> >
> > Endianness is about the addressing of bytes within integer words.
> > This is looking at the ordering of human-readable octal digits
> > displaying the contents of a single byte.  
> 
> On big-endian system:
> 
> $ echo -n I | od -to2
> 0000000 044400
> 0000001
> 
> On little-endian system:
> 
> # echo -n I | od -to2
> 0000000 000111
> 0000001
> 
> So it works.
> It's a single data byte but since the display field is a two byte
> object, where within that two byte object does that single byte show
> up?

I don't use od much. FWIW, what I was missing is that od will have to
pad the input to get an even number of bytes, so it's effectively
working with "I\0".
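For anyone following along, here is a minimal sketch of the same test written out as a POSIX shell function. It assumes od pads the lone input byte with a zero byte, as described above, and then displays the pair as one 16-bit word in host byte order. The function name host_endianness is made up for illustration, and printf stands in for echo -n purely for portability.

```sh
# Print "little" or "big" for the host byte order.
# "I" is 0x49 (octal 111); od sees the two bytes 0x49 0x00.
host_endianness() {
    word=$(printf 'I' | od -An -t o2 | tr -d ' \n')
    case "$word" in
        000111) echo little ;;   # 0x49 is the low-order byte: 0x0049
        044400) echo big ;;      # 0x49 is the high-order byte: 0x4900
        *)      echo unknown ;;  # an od that does not pad, or another word size
    esac
}

host_endianness
```

Matching on the whole word rather than on one digit makes it a little easier to see which byte ended up in which half.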


Re: SA bayes file db permission issue

2016-06-11 Thread Dave Funk

On Sat, 11 Jun 2016, RW wrote:


On Fri, 10 Jun 2016 15:38:44 -0400
Joseph Brennan wrote:



This is a nice test I found:
echo -n I | od -to2 | awk '{ print substr($2,6,1); exit}'

1 little-endian
0 big-endian


I don't see how this can output anything other than 1.

Endianness is about the addressing of bytes within integer words. This
is looking at the ordering of human-readable octal digits displaying
the contents of a single byte.


On big-endian system:

  $ echo -n I | od -to2
  0000000 044400
  0000001

On little-endian system:

  # echo -n I | od -to2
  0000000 000111
  0000001

So it works.
It's a single data byte but since the display field is a two byte
object, where within that two byte object does that single byte show up?

--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_admin    Iowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: SA bayes file db permission issue

2016-06-11 Thread RW
On Fri, 10 Jun 2016 15:38:44 -0400
Joseph Brennan wrote:


> This is a nice test I found:
> echo -n I | od -to2 | awk '{ print substr($2,6,1); exit}'
> 
> 1 little-endian
> 0 big-endian

I don't see how this can output anything other than 1.

Endianness is about the addressing of bytes within integer words. This
is looking at the ordering of human-readable octal digits displaying
the contents of a single byte.


Re: SA bayes file db permission issue

2016-06-10 Thread Martin Gregorie
On Fri, 2016-06-10 at 15:38 -0400, Joseph Brennan wrote:

> Look out for big-endian and little-endian, too. That affects
> databases. 
> This bit us once when we copied a berkeley db from solaris to linux. 
> Endian-ness is based on the cpu hardware, but apparently Macs and
> most hardware used for Linux (like Intel) are both little-endian-- so
> it is probably not the answer in this case.
> 
Has to be an implementation difference in that case, e.g. UTF-8 vs ASCII,
or somebody decided that using an int was wasteful and used a short
instead.

> This is a nice test I found:
> echo -n I | od -to2 | awk '{ print substr($2,6,1); exit}'
> 
> 1 little-endian
> 0 big-endian
> 
Very nice indeed. Thanks for posting it.


Martin



Re: SA bayes file db permission issue

2016-06-10 Thread RW
On Fri, 10 Jun 2016 15:38:44 -0400
Joseph Brennan wrote:

>  wrote:
> 
> > The main database file is binary anyway.  
> 
> 
> Look out for big-endian and little-endian, too. That affects
> databases. This bit us once when we copied a berkeley db from solaris
> to linux. 

That may have changed; they are supposed to be compatible:

http://www.oracle.com/technetwork/database/berkeleydb/db-faq-095848.html

It's just a bit less efficient.

> Endian-ness is based on the cpu hardware, but apparently
> Macs and most hardware used for Linux (like Intel) are both
> little-endian-- so it is probably not the answer in this case.

IIRC older OS X macs used big-endian powerpc processors.


Re: SA bayes file db permission issue

2016-06-10 Thread Joseph Brennan



 wrote:


The main database file is binary anyway.



Look out for big-endian and little-endian, too. That affects databases. 
This bit us once when we copied a berkeley db from solaris to linux. 
Endian-ness is based on the cpu hardware, but apparently Macs and most 
hardware used for Linux (like Intel) are both little-endian-- so it is 
probably not the answer in this case.


This is a nice test I found:
echo -n I | od -to2 | awk '{ print substr($2,6,1); exit}'

1 little-endian
0 big-endian

Joseph Brennan
Columbia U





Re: SA bayes file db permission issue

2016-06-10 Thread RW
On Fri, 10 Jun 2016 00:08:01 +0100
Martin Gregorie wrote:

> On Thu, 2016-06-09 at 15:01 -0700, John Hardin wrote:
> > On Thu, 9 Jun 2016, Martin Gregorie wrote:
> >   
> > > On Thu, 2016-06-09 at 16:54 -0400, Yu Qian wrote:  
> > >> Ok, I found out. so the db files generated on Mac can not be
> > >> used on Linux. vice versa.
> > >
> > > Newline symbols differ: '\n' is 0x0a (LF) for Linux, 0x0d (CR) for
> > > Macs.
> > 
> > WTF? I thought Mac's OS was based on Mach, which is an offshoot of
> > Unix?
> >   
> Since Macs used CR from their debut I thought this got carried over
> into OS-X for file compatibility reasons. Seems that I was wrong
> (except for Excel for OS X, which still uses CR for CSV files).

The main database file is binary anyway.


Re: SA bayes file db permission issue

2016-06-09 Thread Martin Gregorie
On Thu, 2016-06-09 at 15:01 -0700, John Hardin wrote:
> On Thu, 9 Jun 2016, Martin Gregorie wrote:
> 
> > On Thu, 2016-06-09 at 16:54 -0400, Yu Qian wrote:
> >> Ok, I found out. so the db files generated on Mac can not be used
> >> on Linux. vice versa.
> >
> > Newline symbols differ: '\n' is 0x0a (LF) for Linux, 0x0d (CR) for
> > Macs.
> 
> WTF? I thought Mac's OS was based on Mach, which is an offshoot of
> Unix?
> 
Since Macs used CR from their debut I thought this got carried over
into OS-X for file compatibility reasons. Seems that I was wrong
(except for Excel for OS X, which still uses CR for CSV files).


Martin 



Re: SA bayes file db permission issue

2016-06-09 Thread John Hardin

On Thu, 9 Jun 2016, Martin Gregorie wrote:


On Thu, 2016-06-09 at 16:54 -0400, Yu Qian wrote:

Ok, I found out. so the db files generated on Mac can not be used on
Linux. vice versa.


Newline symbols differ: '\n' is 0x0a (LF) for Linux, 0x0d (CR) for
Macs.


WTF? I thought Mac's OS was based on Mach, which is an offshoot of Unix?

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  People think they're trading chaos for order [by ceding more and
  more power to the Government], but they're just trading normal
  human evil for the really dangerous organized kind of evil, the
  kind that simply does not give a shit. Only bureaucrats can give
  you true evil. -- Larry Correia
---
 170 days since the first successful real return to launch site (SpaceX)

Re: SA bayes file db permission issue

2016-06-09 Thread Larry Rosenman

On 2016-06-09 16:25, Martin Gregorie wrote:

On Thu, 2016-06-09 at 16:54 -0400, Yu Qian wrote:

Ok, I found out. so the db files generated on Mac can not be used on
Linux. vice versa.


Newline symbols differ: '\n' is 0x0a (LF) for Linux, 0x0d (CR) for
Macs.

The bad news is that this screws up many programs. The good news is
that it's easily fixed by using the tr utility or special-purpose text
file conversion programs - provided the files don't contain binary
fields or anything else that could leave one of these bit patterns
in a byte.


Martin

This is NO LONGER true for Mac OS X.  It's Unix/Unix-like.


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 17716 Limpia Crk, Round Rock, TX 78664-7281


Re: SA bayes file db permission issue

2016-06-09 Thread Martin Gregorie
On Thu, 2016-06-09 at 16:54 -0400, Yu Qian wrote:
> Ok, I found out. so the db files generated on Mac can not be used on
> Linux. vice versa.
> 
Newline symbols differ: '\n' is 0x0a (LF) for Linux, 0x0d (CR) for
Macs.

The bad news is that this screws up many programs. The good news is
that it's easily fixed by using the tr utility or special-purpose text
file conversion programs - provided the files don't contain binary
fields or anything else that could leave one of these bit patterns
in a byte.
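A rough sketch of the tr approach and of the binary-data caveat, for files that really do use bare CR line endings (classic Mac OS style). The function name cr_to_lf and the sample bytes are invented for illustration.

```sh
# Convert CR (0x0d) line endings to LF (0x0a).
# Only safe for plain text: tr rewrites every 0x0d byte it sees,
# including ones that belong to binary fields.
cr_to_lf() {
    tr '\r' '\n'
}

# Plain text comes out with LF line endings, as intended.
printf 'line one\rline two\r' | cr_to_lf

# A binary field does not survive: the byte pair 0x0d 0x01 becomes
# 0x0a 0x01, silently changing the stored value.
printf '\015\001' | cr_to_lf | od -An -t x1
```

This is why a byte-for-byte translation is the wrong tool for something like a database file.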


Martin



Re: SA bayes file db permission issue

2016-06-09 Thread Yu Qian
Good point, David, I will try as you suggested, that makes more sense.

---
Yu Qian
Ottawa Ontario
Phone: (514)-553-0198



On Thu, Jun 9, 2016 at 5:01 PM, David B Funk 
wrote:

> On Thu, 9 Jun 2016, Yu Qian wrote:
>
>> Yes, I am sure the path is correct, also, if the path is not correct, it
>> will show 'db not present'.
>> I tried to write a small perl script to open the db file, it failed too.
>> so I think it maybe the file damaged during the mounting. but I
>> don't know why this can happen
>>
>> ---
>> Yu Qian
>> Ottawa Ontario
>> Phone: (514)-553-0198
>>
>>
>>
>> On Thu, Jun 9, 2016 at 4:24 PM, John Hardin  wrote:
>>   On Thu, 9 Jun 2016, Yu Qian wrote:
>>
>> My spam assassin works pretty well if I run it on a single machine,
>> either mac or linux. that means I update my rules and train my bayes
>> model on the same machine.
>>
>> But when I tried to train the model and generate bayes file db on mac,
>> and I mounted them to a docker container, then sa-learn failed to read
>> the DB. the permission looks good, because the error just show
>> "failed to open bayes_toks"
>>
>> Anyone know the potential problems?
>>
>>
> Check the version number of the Berkeley DB libraries on the two different
> machines. There are binary-data compatibility issues between some of the
> versions (e.g. a db file created by v3.0 cannot be opened by v4.2, IIRC).
>
> You may have to do an "sa-learn --backup" on the one system and an
> "sa-learn --restore" on the other.
>
>
> --
> Dave Funk  University of Iowa
> College of Engineering
> 319/335-5751   FAX: 319/384-0549   1256 Seamans Center
> Sys_admin/Postmaster/cell_admin    Iowa City, IA 52242-1527
> #include 
> Better is not better, 'standard' is better. B{


Re: SA bayes file db permission issue

2016-06-09 Thread David B Funk

On Thu, 9 Jun 2016, Yu Qian wrote:


Yes, I am sure the path is correct, also, if the path is not correct, it will
show 'db not present'.
I tried to write a small perl script to open the db file, it failed too. so I
think it maybe the file damaged during the mounting. but I don't know why this
can happen

---
Yu Qian
Ottawa Ontario
Phone: (514)-553-0198



On Thu, Jun 9, 2016 at 4:24 PM, John Hardin  wrote:
  On Thu, 9 Jun 2016, Yu Qian wrote:

My spam assassin works pretty well if I run it on a single machine, either
mac or linux. that means I update my rules and train my bayes model on the
same machine.

But when I tried to train the model and generate bayes file db on mac, and
I mounted them to a docker container, then sa-learn failed to read the DB.
the permission looks good, because the error just show "failed to open
bayes_toks"

Anyone know the potential problems?



Check the version number of the Berkeley DB libraries on the two different
machines. There are binary-data compatibility issues between some of the
versions (e.g. a db file created by v3.0 cannot be opened by v4.2, IIRC).

You may have to do an "sa-learn --backup" on the one system and an
"sa-learn --restore" on the other.
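As a sketch of that round trip: sa-learn --backup writes a text dump of the Bayes database to standard output, and sa-learn --restore reads such a dump from a file. The file name bayes-backup.txt and the two helper names are placeholders, not anything SpamAssassin defines.

```sh
# Name of the dump file; override by setting BACKUP_FILE beforehand.
BACKUP_FILE=${BACKUP_FILE:-bayes-backup.txt}

# Run on the machine that trained the database (the Mac in this thread).
export_bayes() {
    sa-learn --backup > "$BACKUP_FILE"
}

# Run on the target machine (the Docker container) after copying the dump over.
import_bayes() {
    sa-learn --restore "$BACKUP_FILE"
}
```

Because the dump is text, it should not depend on the Berkeley DB version or byte order of either side; the database files are rebuilt locally by the restore.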


--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_admin    Iowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{

Re: SA bayes file db permission issue

2016-06-09 Thread Yu Qian
OK, I found out: the db files generated on a Mac cannot be used on Linux,
and vice versa.

I think this is related to how the Perl DBM module processes the db
files on different systems. I am totally new to Perl.

But it's good to know that. Thanks, all.

---
Yu Qian
Ottawa Ontario
Phone: (514)-553-0198



On Thu, Jun 9, 2016 at 4:38 PM, Alan Hodgson 
wrote:

> On Thursday 09 June 2016 16:26:26 Yu Qian wrote:
> > Yes, I am sure the path is correct, also, if the path is not correct, it
> > will show 'db not present'.
> >
> > I tried to write a small perl script to open the db file, it failed too. so
> > I think it maybe the file damaged during the mounting. but I don't know why
> > this can happen
> >
>
> The docker container probably has a different DB version than your Mac.
>
>


Re: SA bayes file db permission issue

2016-06-09 Thread Alan Hodgson
On Thursday 09 June 2016 16:26:26 Yu Qian wrote:
> Yes, I am sure the path is correct, also, if the path is not correct, it
> will show 'db not present'.
> 
> I tried to write a small perl script to open the db file, it failed too. so
> I think it maybe the file damaged during the mounting. but I don't know why
> this can happen
> 

The docker container probably has a different DB version than your Mac.
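One way to compare the two sides, as a rough sketch: ask Perl's DB_File module which Berkeley DB library it was built against, and run the same thing on the Mac and inside the container. This assumes SpamAssassin is using DB_File for its Bayes store and that perl with DB_File is installed in both places; show_db_version is a made-up helper name.

```sh
# Print the DB_File module version and the Berkeley DB library version
# it was built against. Differing Berkeley DB versions on the two
# machines would be consistent with the "failed to open" error.
show_db_version() {
    perl -MDB_File -e 'print "DB_File $DB_File::VERSION, Berkeley DB $DB_File::db_ver\n"'
}
```

Call show_db_version on each machine and compare the two lines of output.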



Re: SA bayes file db permission issue

2016-06-09 Thread Yu Qian
Yes, I am sure the path is correct, also, if the path is not correct, it
will show 'db not present'.

I tried writing a small Perl script to open the db file, and it failed too,
so I think the file may have been damaged during mounting, but I don't know
why this would happen.

---
Yu Qian
Ottawa Ontario
Phone: (514)-553-0198



On Thu, Jun 9, 2016 at 4:24 PM, John Hardin  wrote:

> On Thu, 9 Jun 2016, Yu Qian wrote:
>
>> My spam assassin works pretty well if I run it on a single machine, either
>> mac or linux. that means I update my rules and train my bayes model on the
>> same machine.
>>
>> But when I tried to train the model and generate bayes file db on mac,
>> and I mounted them to a docker container, then sa-learn failed to read
>> the DB. the permission looks good, because the error just show
>> "failed to open bayes_toks"
>>
>> Anyone know the potential problems?
>>
>
> Are you sure the path is correct?
>
> Run sa-learn in debug mode to see where it's looking for the bayes DB.
>
>
> --
>  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
>  jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
>  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
> ---
>   ...wind turbines are not meant to actually be an efficient way to
>   supply the power grid, rather they're prayer wheels for New Age
>   iBuddhists, their whirring blades drawing white guilt from the
>   atmosphere and pumping it safely underground.-- Tam
> ---
>  170 days since the first successful real return to launch site (SpaceX)
>


Re: SA bayes file db permission issue

2016-06-09 Thread John Hardin

On Thu, 9 Jun 2016, Yu Qian wrote:


My spam assassin works pretty well if I run it on a single machine, either
mac or linux. that means I update my rules and train my bayes model on the
same machine.

But when I tried to train the model and generate bayes file db  on mac, and
I mounted them to a docker container, then sa-learn failed to read the DB.
the permission looks good, because the error just show "failed to open
bayes_toks"

Anyone know the potential problems?


Are you sure the path is correct?

Run sa-learn in debug mode to see where it's looking for the bayes DB.
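A minimal sketch of what that might look like: -D turns on sa-learn's debug output, which goes to stderr, and --dump magic asks it to open the database and print its summary. The helper name find_bayes_path and the grep filter are illustrative choices only.

```sh
# Show the debug lines that mention the Bayes store, including the
# path sa-learn is trying to open.
find_bayes_path() {
    sa-learn -D --dump magic 2>&1 | grep -i bayes
}
```

Running find_bayes_path inside the container should show whether it is looking at the mounted files at all.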


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  ...wind turbines are not meant to actually be an efficient way to
  supply the power grid, rather they're prayer wheels for New Age
  iBuddhists, their whirring blades drawing white guilt from the
  atmosphere and pumping it safely underground.-- Tam
---
 170 days since the first successful real return to launch site (SpaceX)


Re: SA bayes file db permission issue

2016-06-09 Thread Yu Qian
OK, I think it is just because the db file cannot be opened by the Perl DBM
module, but I am confused about why it can't be opened.

---
Yu Qian
Ottawa Ontario
Phone: (514)-553-0198



On Thu, Jun 9, 2016 at 4:11 PM, Yu Qian  wrote:

> My spam assassin works pretty well if I run it on a single machine, either
> mac or linux. that means I update my rules and train my bayes model on the
> same machine.
>
> But when I tried to train the model and generate bayes file db  on mac,
> and I mounted them to a docker container, then sa-learn failed to read the
> DB. the permission looks good, because the error just show "failed to open
> bayes_toks"
>
> Anyone know the potential problems?
>
> thanks
>
>
>
>


SA bayes file db permission issue

2016-06-09 Thread Yu Qian
SpamAssassin works pretty well for me if I run it on a single machine, either
Mac or Linux; that is, I update my rules and train my Bayes model on the
same machine.

But when I trained the model and generated the Bayes db files on the Mac and
mounted them into a Docker container, sa-learn failed to read the DB.
The permissions look fine, because the error only says "failed to open
bayes_toks".

Anyone know the potential problems?

thanks