from:"Peter Teoh"

Re: I remove the audit and selinux from kernel and start the new kernel in Centos, but i can't bring up the network device, Why?

2014-10-26 Thread Peter Teoh

Hi Sizel,

Just a few possibly pertinent questions:

On Mon, Oct 13, 2014 at 2:36 PM,  wrote:

> On Mon, 13 Oct 2014 13:04:08 +0800, sizel said:
> > I remove the audit and selinux from kernel and start the new kernel  in
> > Centos, but i can't bring up the network device, Why?
>
"Remove": how did you do it? Did you recompile the kernel with SELinux and
audit disabled in the config (CONFIG_SECURITY_SELINUX and CONFIG_AUDIT not
set), or did you just follow a standard procedure (like the ones below)?

http://www.crypt.gen.nz/selinux/disable_selinux.html
https://www.digitalocean.com/community/tutorials/an-introduction-to-selinux-on-centos-7-part-1-basic-concepts

just like this case:

https://www.centos.org/forums/viewtopic.php?t=30942

I suspect you got the errors simply because SELinux was configured/disabled
improperly, so try again.
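
One way to confirm what the rebuilt kernel was really configured with is to
read its config file. Below is a minimal Python sketch, assuming the config
is available as text (on CentOS it is typically installed as
/boot/config-<version>, but that path is an assumption about your setup).
The helper name config_value and the sample text are made up for
illustration.

```python
def config_value(config_text, option):
    """Return the value of a kernel config option, or None if it is not set."""
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as a comment: "# CONFIG_FOO is not set"
        if line == "# %s is not set" % option:
            return None
        # Enabled options appear as "CONFIG_FOO=y" (or =m, or a literal value)
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None  # option not mentioned at all


# Hypothetical excerpt: SELinux removed, audit still built in
SAMPLE_CONFIG = """\
CONFIG_NET=y
CONFIG_AUDIT=y
# CONFIG_SECURITY_SELINUX is not set
"""

for opt in ("CONFIG_AUDIT", "CONFIG_SECURITY_SELINUX", "CONFIG_NET"):
    print(opt, "->", config_value(SAMPLE_CONFIG, opt))
```

If the networking options you depend on also show up as not set, the
problem is in the new config rather than in SELinux or audit themselves.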



> This would be a lot easier to answer if you gave us some actual details:
>
> 1) Why did you think audit and selinux were the problem? (They probably
> aren't).
> 2) What *exactly* is "the network device"? A wireless card? 1G Ethernet?
> 10G Ethernet? InfiniBand? Something else?
> 3) What configuration are you trying to set up?
> 4) What error message, if any, do you get?
> 5) Have you ruled out the easy stuff, like trying to start an Ethernet
> port with a missing/bad cable?
> 6) Why do you think it's a kernel problem and not a userspace problem?
>
> ___
> Kernelnewbies mailing list
> Kernelnewbies@kernelnewbies.org
> http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
>
>


-- 
Regards,
Peter Teoh

Re: Removed from eudyptula challenge

2014-09-19 Thread Peter Teoh

On Sat, Sep 20, 2014 at 3:05 AM, Jeshwanth Kumar N K  wrote:

> Hello,
>
> Today I was asking some suggestions in IRC for my eudyptula challenge
> (indirectly, because working for it for 1 month). So I am removed from the
> challenge now.
>
> So, who all doing the challenge please do everything yourself by reading
> the docs, kernel codes or ask little directly. Because, you will feel
> really bad after removing from challenge, anyway my mistake, I shouldn't
> have break the integrity.
>
> And my mistake was I thought I am smart in asking questions and nobody
> will get doubt :). So don't do that :).
>
>
It does not matter. The aim of the whole thing is to learn, and to learn
you search, read, discuss, or ask; solving the problem is not the ultimate
aim. True, the Internet has tons of resources to learn from, but we always
want targeted learning that solves a narrow range of problems in the
fastest, most efficient way.   ASK!

And as Newton said, his knowledge was built on the shoulders of giants. So
please ask, and help add value to others' knowledge. Solving the Eudyptula
Challenge COMPLETELY DOES NOT ACHIEVE THAT GOAL, because everything is
hidden and shrouded in secrecy.

Just my 2 cents.

Re: suspend/resume PM criterion for application

2014-09-16 Thread Peter Teoh

On Wed, Sep 17, 2014 at 11:16 AM, Peter Teoh 
wrote:

>
>
> On Sun, Sep 14, 2014 at 2:11 AM, Ran Shalit  wrote:
>
>> On Thu, Sep 11, 2014 at 12:24 PM, Ran Shalit  wrote:
>> > On Thu, Sep 11, 2014 at 8:32 AM, AYAN KUMAR HALDER <
>> ayankum...@gmail.com> wrote:
>> >> On Thu, Sep 11, 2014 at 12:55 AM,   wrote:
>> >>> On Wed, 10 Sep 2014 21:58:48 +0300, Ran Shalit said:
>> >>>
>> >>>> 1. How can I make a process to notice this inactivity ? Do you think
>> >>>> it can be implemented by some periodic process who check if there is
>> >>>> activity ? It returns to the original question I raised, that I will
>> >>>> use some periodic process who checks maybe cpu load or something like
>> >>>> that. What do you think ?
>> >>>
>> >>> That's going to depend on your system and what processes are running.
>> >>>
>> >>> You may have an MP3 player going that doesn't take much CPU at all -
>> but
>> >>> shutting down because the user hasn't hit a button in 47 minutes will
>> probably
>> >>> irritate the user no end.  Or there may be a screensaver running that
>> takes
>> >>> twice as much CPU as the MP3 player, but is totally OK on the system
>> >>> suspending whenever the rest of the system wants it.
>> >>>
>> >>> You're going to have to look at your system design, and decide for
>> yourself
>> >>> what the criteria are.
>> >>
>> >> Please correct me if my understanding is wrong:-
>> >>
>> >> I believe that autosuspend feature (for system suspend) is not present
>> >> in kernel. I believe that there is no feature in kernel which checks
>> >> for system ( cpu, devices) inactivity and suspends the entire system.
>> >> System suspend is caused when :-
>> >> 1. the user issues a command
>> >> 2. The system receives some interrupt or event (lid closing event)
>> >> 3. There is an external process which monitors system inactivity and
>> >> suspends the system.
>> >>
>> >> For runtime suspend of a device, I believe it is the driver who has
>> >> the complete responsibility to decide when to suspend the device or
>> >> resume it.  The driver can take this decision on user intervention (eg
>> >> when user writes to   /sys/devices//power/* ) or when the
>> >> driver has completed servicing an interrupt and feels it has nothing
>> >> more to do, etc
>> >
>> > Thanks Vlaid, Ayan,
>> >
>> > I am a bit yet struggling for couple of days on this PM issue, and I
>> > would appreciate your continous advise.
>> > The system requirement I have is as following:
>> > 1. make everything as automatic as possible , so that there won't be
>> > any need to add any userspace application for the matter.
>> > 2. wakeup from all relevant wakeup sources
>> > 3. should not use sysfs (it should be disabled from kernel)
>> > 4. platform is OMAP3530.
>>
>
> a.   Look into arch/arm/mach-omap2 in the kernel source and grep for the
> "sleep" and "wakeup" functionality. Power management there is just a
> matter of managing the different CPU frequencies. As far as I can tell,
> once the chip is asleep only the UART pin can be used to wake it up, but
> I am not sure.
>
> b.   Read these:
>
>
> http://e2e.ti.com/support/dsp/omap_applications_processors/f/447/t/30005.aspx
>
> http://www.ti.com/lit/an/slva310b/slva310b.pdf   (read page 2,
> "Powering-Up Sequence", which describes the different power-up sequences
> of the CPU).
>
>
> c.   The technology name used for the OMAP3530 is "DVFS" (dynamic voltage
> and frequency scaling). Search for it inside the arch/arm kernel source;
> you can find lots of sample code there.
>
> (Don't confuse it with "DeepSleep", which is PM for another type of OMAP
> CPU.)
>
> d.   http://www.ti.com/product/omap3530 --> on the right there is DVSDK +
> Android source code for the 3530. Grep that code for the above keywords.
>
> Hopefully it helps.
>
>
At the risk of leaving out other files, how about these two files inside
arch/arm/mach-omap2:

omap-pm.h
omap-pm-noop.c

I think they provide a lot of hints for you.

Re: suspend/resume PM criterion for application

2014-09-16 Thread Peter Teoh

On Sun, Sep 14, 2014 at 2:11 AM, Ran Shalit  wrote:

> On Thu, Sep 11, 2014 at 12:24 PM, Ran Shalit  wrote:
> > On Thu, Sep 11, 2014 at 8:32 AM, AYAN KUMAR HALDER 
> wrote:
> >> On Thu, Sep 11, 2014 at 12:55 AM,   wrote:
> >>> On Wed, 10 Sep 2014 21:58:48 +0300, Ran Shalit said:
> >>>
> >>>> 1. How can I make a process to notice this inactivity ? Do you think
> >>>> it can be implemented by some periodic process who check if there is
> >>>> activity ? It returns to the original question I raised, that I will
> >>>> use some periodic process who checks maybe cpu load or something like
> >>>> that. What do you think ?
> >>>
> >>> That's going to depend on your system and what processes are running.
> >>>
> >>> You may have an MP3 player going that doesn't take much CPU at all -
> but
> >>> shutting down because the user hasn't hit a button in 47 minutes will
> probably
> >>> irritate the user no end.  Or there may be a screensaver running that
> takes
> >>> twice as much CPU as the MP3 player, but is totally OK on the system
> >>> suspending whenever the rest of the system wants it.
> >>>
> >>> You're going to have to look at your system design, and decide for
> yourself
> >>> what the criteria are.
> >>
> >> Please correct me if my understanding is wrong:-
> >>
> >> I believe that autosuspend feature (for system suspend) is not present
> >> in kernel. I believe that there is no feature in kernel which checks
> >> for system ( cpu, devices) inactivity and suspends the entire system.
> >> System suspend is caused when :-
> >> 1. the user issues a command
> >> 2. The system receives some interrupt or event (lid closing event)
> >> 3. There is an external process which monitors system inactivity and
> >> suspends the system.
> >>
> >> For runtime suspend of a device, I believe it is the driver who has
> >> the complete responsibility to decide when to suspend the device or
> >> resume it.  The driver can take this decision on user intervention (eg
> >> when user writes to   /sys/devices//power/* ) or when the
> >> driver has completed servicing an interrupt and feels it has nothing
> >> more to do, etc
> >
> > Thanks Vlaid, Ayan,
> >
> > I am a bit yet struggling for couple of days on this PM issue, and I
> > would appreciate your continous advise.
> > The system requirement I have is as following:
> > 1. make everything as automatic as possible , so that there won't be
> > any need to add any userspace application for the matter.
> > 2. wakeup from all relevant wakeup sources
> > 3. should not use sysfs (it should be disabled from kernel)
> > 4. platform is OMAP3530.
>

a.   Look into arch/arm/mach-omap2 in the kernel source and grep for the
"sleep" and "wakeup" functionality. Power management there is just a
matter of managing the different CPU frequencies. As far as I can tell,
once the chip is asleep only the UART pin can be used to wake it up, but I
am not sure.

b.   Read these:

http://e2e.ti.com/support/dsp/omap_applications_processors/f/447/t/30005.aspx

http://www.ti.com/lit/an/slva310b/slva310b.pdf   (read page 2,
"Powering-Up Sequence", which describes the different power-up sequences of
the CPU).


c.   The technology name used for the OMAP3530 is "DVFS" (dynamic voltage
and frequency scaling). Search for it inside the arch/arm kernel source;
you can find lots of sample code there.

(Don't confuse it with "DeepSleep", which is PM for another type of OMAP
CPU.)

d.   http://www.ti.com/product/omap3530 --> on the right there is DVSDK +
Android source code for the 3530. Grep that code for the above keywords.

Hopefully it helps.



> >
> > Now, As I understand thus far, I have the following options (
> > requirement 3 above I will ignore, don't know how to handle it yet,
> > and assume for meanwhile that I have sysfs) :
> > 1. use suspend scheme (no runtime PM)
> > 1.a. create some kernel thread who check cpu load and will decide
> > to disable system only if its below some minimum threshold (which
> > should indicate no activity)
> > 1.b. initialize all HW interrupts (gpio, uart, etc) as wakeup sources
> > with this scheme only this thread is responsible for the suspend,
> > and there is no use of the runtime PM, right ?
> >
> > 2. use runtime PM scheme :
> > With this scheme I don't understand how some device will wake the
> > system , or doesn't it need to  ? If a driver wakes up maybe it need
> > to deliver some info to system?
> >
> > I think option 1 is also easier to support, what do you think about both
> ?
> >
> > Thanks!!
> > Ran
>
> Does Anyone have any suggestions and feedback on the above requirements ?
>
> Thank you,
> Ran
>



-- 
Regards,
Peter Teoh

Re: Any char device example for runtime PM ?

2014-09-14 Thread Peter Teoh

On Sat, Sep 13, 2014 at 3:50 PM, Ran Shalit  wrote:

> On Sat, Sep 13, 2014 at 4:14 AM, Peter Teoh 
> wrote:
> > Please elaborate on your requirements. A char dev is for I/O to
> > hardware, but runtime PM is for hibernating the machine. What is the
> > connection you are trying to achieve?
> >
> > On Mon, Sep 8, 2014 at 1:22 PM, Ran Shalit  wrote:
> >>
> >> Hello,
> >>
> >> Is there any character device example using runtime PM available ?
> >> It is most helpful,
> >>
> Hi,
>
> Some of the drivers I'm using are char devices, while I only saw
> platform device registration for runtime PM, so my question stem from
> this.
>
> As to the system requirement I have, it is as following:
> 1. make everything as automatic as possible , so that there won't be
> any need to add any userspace application for the matter.
> 2. wakeup from all relevant wakeup sources
> 3. should not use sysfs (it should be disabled from kernel)
> 4. platform is OMAP3530.
>
> Now, As I understand this far, I have the following options (
> requirement 3 above I will ignore, don't know how to handle it yet,
> and assume for meanwhile that I have sysfs) :
> 1. use suspend scheme (no runtime PM)
> 1.a. create some kernel periodic thread who check cpu load and will
> decide
> to disable system only if its below some minimum threshold (which
> should indicate no activity)
> 1.b. initialize all HW interrupts (gpio, uart, etc) as wakeup sources
> with this scheme only this thread is responsible for the suspend,
> and there is no use of the runtime PM, right ?
>
> 2. use runtime PM scheme :
> With this scheme I don't understand how some device will wake the
> system , or doesn't it need to  ? If a driver wakes up maybe it need
> to deliver some info to system?
>
>
As a general comment, your requirements for PM sound odd.

a.   Normally, the Linux kernel has its own PM protocol, which governs which
devices save their state and restore it later. There is a hierarchy of
calls to be made, a complex daisy chain from the devices up to the higher
logical levels. But you never mention or plan to integrate with this
infrastructure?

b.   Hardware PM (sorry, I am a software guy, so I may be wrong) for a
microcontroller/CPU normally means different power states, each of which
disables a different set of external pins, and in the lowest-power state
only one or two pins are available to wake up the CPU/microcontroller. But
when you mention that so many pins are potential wakeup sources, then it is
not powered down at all.

I am being vague and brief so as not to waste time, as this is a big topic,
sorry.



-- 
Regards,
Peter Teoh

Re: x86_64_defconfig and i386_defconfig: What is the difference?

2014-09-12 Thread Peter Teoh

On Tue, Sep 9, 2014 at 3:58 PM, Rajat Jain  wrote:

> Hi,
>
> Can someone tell me if the i386 one is to be used when we want to build
> for a 32bit machine and the x86_64 is to be used for 64 bit machine?
>

i386 or 32-bit machines?   I think real i386 machines hardly exist anymore;
what is more likely correct is "i386-compatible machine".

The i386 config is for a 32-bit OS, i.e., all the binaries must be built
for the 32-bit architecture.

So choose the config that matches the userspace files/libraries you have to
support it.
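
A quick way to see whether an existing userspace is 32-bit or 64-bit is to
look at the class byte in the ELF header of its binaries (the fifth byte,
EI_CLASS: 1 means 32-bit, 2 means 64-bit). Here is a minimal Python sketch;
the function names are made up for illustration, and the examples use
in-memory headers so that it runs anywhere.

```python
ELF_MAGIC = b"\x7fELF"
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}  # EI_CLASS: ELFCLASS32 / ELFCLASS64


def elf_class(header):
    """Classify an ELF file from its first five bytes."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    return ELF_CLASSES.get(header[4], "unknown")


def elf_class_of_file(path):
    """Read just enough of `path` to classify it (path is caller-supplied)."""
    with open(path, "rb") as f:
        return elf_class(f.read(5))


# In-memory examples: headers of a 32-bit and of a 64-bit binary
print(elf_class(b"\x7fELF\x01"))
print(elf_class(b"\x7fELF\x02"))
```

On a real system you would call elf_class_of_file() on something like the
shell or the C library and pick the defconfig accordingly.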

>
> Thanks,
>
> Rajat
>

-- 
Regards,
Peter Teoh

Re: Testing Code for Btrfs

2014-09-06 Thread Peter Teoh

Some well-known filesystem testing tools are listed here:

http://linuxpoison.blogspot.sg/2008/07/linux-filesystem-testing-tools.html

LTP is one of my favorites; it is very actively updated and focuses on
testing the kernel as a whole.


On Sat, Sep 6, 2014 at 10:48 AM, nick  wrote:

> Hey Guys,
> After purchasing a hard drive for btrfs testing, I am wondering what areas
> of testing you would like me to do.
> In addition this drive is enterprise based, a Seagate Constellation so
> feel free to hammer it with the tests
> as you wish :), I have no important data on it and don't care about losing
> it.
> Nick
>



-- 
Regards,
Peter Teoh

Re: Questions about Kernel Memory that I didn't find answers in Google - Please Help

2014-08-04 Thread Peter Teoh

And on Q2:

Just want to comment that the load address has to be fixed initially. With
a normal ELF binary, relocation is done by the dynamic linker after the ELF
is loaded. For vmlinuz we cannot have that relocation step, because all
that runs before the kernel is the BIOS / U-Boot / bootloader etc.   That
is one possible answer.   Others:

https://groups.google.com/forum/#!topic/comp.os.linux.embedded/0-SAzCqQKFM

And perhaps some of the links below may help you:

http://jianggmulab.blogspot.sg/2010_01_01_archive.html

http://stackoverflow.com/questions/5647279/why-does-the-module-start-from-address-0xbf00

http://www.arm.linux.org.uk/developer/memory.txt

http://en.wikipedia.org/wiki/High_memory

Bottom line: keep googling.

Q6 and Q7 make no sense to me, sorry.



On Mon, Aug 4, 2014 at 11:22 PM, Lucas Tanure  wrote:

> Thanks!
>
> A quick look in all of that show me that there a lot of information
> about how kernel manage memory.
> But, I will find the answer for question 2, 6 and 7 in it ?
>
> Thanks!
> --
> Lucas Tanure
> +55 (19) 988176559
>
>
> On Sun, Aug 3, 2014 at 8:58 PM, Peter Teoh 
> wrote:
> > I like your curiosities and interests in Linux
> > kernel.
> http://virtuallyhyper.com/2013/07/rhcsa-and-rhce-chapter-10-the-kernel/
> >
> > Instead of answering one by one, I think I will just identify the
> knowledge
> > you are lacking:
> >
> > Memory management (from both x86/intel and linux kernel perspective).
> >
> > There are many many resources out there for you in these area, eg:
> >
> > http://en.wikipedia.org/wiki/Page_table
> > http://en.wikipedia.org/wiki/X86-64
> >
> > (both boring, but just understand it well enough)
> >
> > http://wiki.osdev.org/Paging   (good explanation; understand it very,
> > very well).
> >
> > The ultimate classic ebook:
> >
> > https://www.kernel.org/doc/gorman/pdf/understand.pdf
> >
> > And this blog site has tons of good info on intel/memory etc:
> >
> > http://duartes.org/gustavo/blog/post/cpu-rings-privilege-and-protection/
> > http://duartes.org/gustavo/blog/post/anatomy-of-a-program-in-memory/
> >
> > http://virtuallyhyper.com/2013/07/rhcsa-and-rhce-chapter-10-the-kernel/
> >
> > http://www.cse.psu.edu/~anand/spring01/linux/memory.ppt
> >
> > One more thing:
> >
> > "readelf -S -W vmlinux" shows you the sections and the addresses where
> > the different sections are supposed to be loaded in memory.   If you
> > replace vmlinux with a kernel module, e.g. ip_tables.ko, then it says:
> >
> > starting at offset 0x328c blah blah
> >
> > so the load addresses are relative to ZERO, whereas the actual module
> > address is:
> >
> > sudo cat /proc/modules |grep ip_table
> >
> > ip_tables 18106 1 iptable_filter, Live 0xf8bf5000
> >
> > So just add 0xf8bf5000 to each address in your readelf output and you
> > will get the actual virtual address of that section IN MEMORY.
> >
> > That holds only in memory.   In the file, the file offset of the section
> > is different. Many parts inside the ELF also differ from memory: you
> > will need to add the virtual load address (above) to the offsets
> > specified inside the relocation tables (objdump -r), and each section
> > has a separate relocation table (all independent of one another,
> > meaning that the different sections CAN BE loaded to different parts of
> > memory).
> >
> > Thanks.
> >
> >
> > On Sun, Aug 3, 2014 at 11:59 PM, Lucas Tanure  wrote:
> >>
> >> Hi,
> >>
> >> I'm looking for some site, pdf, book etc, that can answer this
> questions.
> >> For now I have :
> >>
> >>
> http://unix.stackexchange.com/questions/5124/what-does-the-virtual-kernel-memory-layout-in-dmesg-imply
> >>
> >>
> >> I want to understand a few things about the memory and the execution
> >> of Linux kernel.
> >> Taking from a X86 and grub I have:
> >>
> >> 1) Grub loads kernel and root file system in memory, and the vmlinux
> >> has the code to decompress it self, right ? linux
> >>
> >> 2) The address of load kernel is always the same ? And It's at
> >> compilation time that is chosen ?
> >>
> >> 2a) The kernel takes places in 3g-4g memory place, and user space from 0
> >> to 3gb.
> >> But if the pc has only 256mb of memory ?
> >> And when pc has 16gb of memory, the user space will be split in two ?
> >>
> >> 2b) And if kernel has soo many modules that needs more than

Re: Questions about Kernel Memory that I didn't find answers in Google - Please Help

2014-08-03 Thread Peter Teoh

f8178d5c8
>  98d5c8 0050a8 00   A  0   0  8
>   [12] __ksymtab_strings      PROGBITS 81792670 992670 01cb42 00   A  0     0    1
>   [13] __init_rodata          PROGBITS 817af1c0 9af1c0 e8     00   A  0     0   32
>   [14] __param                PROGBITS 817af2a8 9af2a8 000b00 00   A  0     0    8
>   [15] __modver               PROGBITS 817afda8 9afda8 000258 00   A  0     0    8
>   [16] .data                  PROGBITS 8180     a0     0e1180 00  WA  0     0 4096
>   [17] .vvar                  PROGBITS 818e2000 ae2000 001000 00  WA  0     0   16
>   [18] .data..percpu          PROGBITS          c0     015300 00  WA  0     0 4096
>   [19] .init.text             PROGBITS 818f9000 cf9000 0503ea 00  AX  0     0   16
>   [20] .init.data             PROGBITS 8194a000 d4a000 09e4c8 00  WA  0     0 4096
>   [21] .x86_cpu_dev.init      PROGBITS 819e84c8 de84c8 18     00   A  0     0    8
>   [22] .parainstructions      PROGBITS 819e84e0 de84e0 00bd3c 00   A  0     0    8
>   [23] .altinstructions       PROGBITS 819f4220 df4220 005f40 00   A  0     0    1
>   [24] .altinstr_replacement  PROGBITS 819fa160 dfa160 001a69 00  AX  0     0    1
>   [25] .iommu_table           PROGBITS 819fbbd0 dfbbd0 f0     00   A  0     0    8
>   [26] .apicdrivers           PROGBITS 819fbcc0 dfbcc0 20     00  WA  0     0    8
>   [27] .exit.text             PROGBITS 819fbce0 dfbce0 0009bc 00  AX  0     0    1
>   [28] .smp_locks             PROGBITS 819fd000 dfd000 005000 00   A  0     0    4
>   [29] .data_nosave           PROGBITS 81a02000 e02000 001000 00  WA  0     0    4
>   [30] .bss                   NOBITS   81a03000 e03000 122000 00  WA  0     0 4096
>   [31] .brk                   NOBITS   81b25000 e03000 425000 00  WA  0     0    1
>   [32] .comment               PROGBITS          e03000 27     01  MS  0     0    1
>   [33] .debug_frame           PROGBITS          e03028 002560 00      0     0    8
>   [34] .shstrtab              STRTAB            e05588 00018a 00      0     0    1
>   [35] .symtab                SYMTAB            e06058 1a29f8 18     36 43659    8
>   [36] .strtab                STRTAB            fa8a50 180d92 00      0     0    1
> Key to Flags:
>   W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
>   I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
>   O (extra OS processing required) o (OS specific), p (processor specific)
>
> So the vmlinux is loaded in memory like a dd ?
>
> 5) In my function A, inside the module that I wrote, will a non-initialized
> variable be placed in the non-initialized section that was loaded in
> memory?
> Or does my module have new sections for its own use, so that it is loaded
> into memory like a process, with all its sections?
> So how will another module or kernel code find my exported
> variable/function?
>
>
> 6) Let's suppose:
> I have a int variable, with 17 as content, and the address is 0xGG.
> If I stop the linux in this time, read my memory at address 0xGG I
> will got 17, right ?
> 0xGGG will be bigger than 0xc000 always,  right ?
>
>
> 7) Now take the int from question 6 and change it to:
> struct mystruct *foo = kmalloc(sizeof(struct mystruct), GFP_KERNEL);
>
> I will be able to read at address 0xGG the struct that created,
> and it address will be greater than 0xc000, right ?
> But for this struct, the memory will be allocated for ever, until I
> free the pointer, right ?
>
>
>
> Well, this just a start. I really want to understand how kernel is
> run, loaded etc. Any help is appreciate, answering my questions, links
> to read, books to read.
> Actually, I didn't find any book with that kind of information .
>
>
> --
> Lucas Tanure
> +55 (19) 988176559
>



-- 
Regards,
Peter Teoh

Re: userspace stack start and end

2014-08-01 Thread Peter Teoh

Also look into the function "print_context_stack()", which will teach you
how to identify the start/end of the stack, check whether an address is
valid, and traverse from one frame to another (using RBP / EBP of course,
so CONFIG_FRAME_POINTER is definitely needed).
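
To make the idea concrete, here is a toy Python model of a frame-pointer
walk with bounds checks. It is not the kernel's code, only a sketch of the
technique: the stack is modelled as a dict from addresses to 64-bit words,
and the name walk_frames and all the addresses are made up.

```python
WORD = 8  # x86-64: saved RBP and return address are 8 bytes each


def walk_frames(memory, rbp, stack_low, stack_high):
    """Yield return addresses by following the saved-RBP chain.

    The walk stops as soon as RBP leaves [stack_low, stack_high) or stops
    moving towards higher addresses.
    """
    while stack_low <= rbp and rbp + 2 * WORD <= stack_high:
        yield memory[rbp + WORD]      # return address sits just above saved RBP
        next_rbp = memory[rbp]        # saved RBP of the caller
        if next_rbp <= rbp:           # the chain must move up the stack
            break
        rbp = next_rbp


# Toy stack with three frames
STACK_LOW, STACK_HIGH = 0x7000, 0x8000
memory = {
    0x7f00: 0x7f40, 0x7f08: 0x400111,   # innermost frame
    0x7f40: 0x7f80, 0x7f48: 0x400222,   # its caller
    0x7f80: 0x0,    0x7f88: 0x400333,   # outermost frame, saved RBP is 0
}

print([hex(a) for a in walk_frames(memory, 0x7f00, STACK_LOW, STACK_HIGH)])
```

The two checks in the loop are the "is this address valid" part: without
known stack bounds the walk has no safe place to stop.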


On Sat, Aug 2, 2014 at 12:22 AM, Peter Teoh  wrote:

> FYI, there are many different types of kernel stack:
>
> http://www.x86-64.org/pipermail/discuss/2005-April/005944.html
>
>
> On Mon, Jul 28, 2014 at 12:52 AM, Xin Tong  wrote:
>
>> I am trying to find the start and end address of the userspace stack. I
>> see in the task_struct there is start_stack. But I could not find end_start
>> anywhere in the kernel code ?
>>
>> Can someone please tell me how to find the end of the stack ?
>>
>> Thanks,
>> Xin
>>
>
>
> --
> Regards,
> Peter Teoh
>



-- 
Regards,
Peter Teoh

Re: userspace stack start and end

2014-08-01 Thread Peter Teoh

FYI, there are many different types of kernel stack:

http://www.x86-64.org/pipermail/discuss/2005-April/005944.html


On Mon, Jul 28, 2014 at 12:52 AM, Xin Tong  wrote:

> I am trying to find the start and end address of the userspace stack. I
> see in the task_struct there is start_stack. But I could not find end_start
> anywhere in the kernel code ?
>
> Can someone please tell me how to find the end of the stack ?
>
> Thanks,
> Xin
>


-- 
Regards,
Peter Teoh

Re: building kernel with -O

2014-07-30 Thread Peter Teoh

If a function is built with a frame pointer, then the word at EBP + 4 is
the return address into the caller of the present function.   By
convention, the body of the function usually does not touch EBP's value, so
relative to it you can always retrieve the return address into the caller
(which is what this builtin does).

And you ask what happens if it is not compiled inline?   Then
__builtin_return_address() becomes a function itself, and you would be
getting the caller of "__builtin_return_address".   That was not the
original intention.   Its purpose is to get the return address of the
current function.
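
As an illustration only (a toy model in Python, not how GCC implements the
builtin), the "EBP + 4" rule and the meaning of the level argument can be
sketched like this. The function name return_address and the addresses are
made up.

```python
WORD = 4  # 32-bit x86: saved EBP and return address are 4 bytes each


def return_address(memory, ebp, level=0):
    """Mimic the idea behind __builtin_return_address(level) on a toy stack."""
    for _ in range(level):
        ebp = memory[ebp]             # [EBP] holds the caller's saved EBP
    return memory[ebp + WORD]         # [EBP + 4] holds the return address


# Toy stack: the current function's frame is at 0x1000, its caller's at 0x1020
memory = {
    0x1000: 0x1020, 0x1004: 0x08048111,
    0x1020: 0x1040, 0x1024: 0x08048222,
}

print(hex(return_address(memory, 0x1000)))      # level 0
print(hex(return_address(memory, 0x1000, 1)))   # level 1
```

Each extra level is one more hop along the saved-EBP chain, which is why
the whole scheme depends on every function in the chain keeping a frame
pointer.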


On Thu, Jul 31, 2014 at 9:05 AM, Xin Tong  wrote:

> In that case, the __builtin_return_address(level) level > 1 is not
> possible either ? what if the kernel uses this ?
>
> Xin
>
>
> On Wed, Jul 30, 2014 at 8:00 PM, Peter Teoh 
> wrote:
>
>>
>>
>>
>> On Thu, Jul 31, 2014 at 12:59 AM, Xin Tong  wrote:
>>
>>> why can not __builtin_return_address() be made *never* inline and use
>>> current level+1 to get the return address of the function of interest.  For
>>> any stack introspection, having 1 more level will not hurt functionality.
>>>
>>
>> Actually, the answer to your remark is "impossible" in the case where
>> the kernel is compiled without frame pointers (CONFIG_FRAME_POINTER=n),
>> which is true for certain variants of RHEL / CentOS.   Without the saved
>> EBP on the stack, there is no way to know where to stop reading the
>> stack to retrieve the previous stack frame.   Of course you can
>> statically walk the disassembly of the function and see how much stack
>> space that particular function has allocated, but that requires
>> implementing a disassembler in the kernel.
>>
>>
>>
>>>
>>> given its explanation below
>>>
>>> — Built-in Function: void * *__builtin_return_address* (unsigned int
>>> level)
>>>
>>> This function returns the return address of the current function, or of
>>> one of its callers. The level argument is number of frames to scan up
>>> the call stack. A value of 0 yields the return address of the current
>>> function, a value of 1 yields the return address of the caller of the
>>> current function, and so forth. When inlining the expected behavior is that
>>> the function returns the address of the function that is returned to. To
>>> work around this behavior use the noinline function attribute.
>>>
>>>
>>>
>>>
>>
>> --
>> Regards,
>> Peter Teoh
>>
>
>


-- 
Regards,
Peter Teoh

Re: IRQ mismatch ifconfig and /proc/interrupts

2014-07-30 Thread Peter Teoh

I suspect it is a bug. Mine is Ubuntu 12.04 32-bit with the 3.2.0-32 PAE kernel:

cat /proc/interrupts
   CPU0   CPU1   CPU2   CPU3
41:  0  0  0  0   PCI-MSI-edge  eth0

eth0  Link encap:Ethernet  HWaddr 5c:f9:dd:75:54:d8
  Interrupt:41 Base address:0x8000

Everything matches.
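
If you want to compare the two views mechanically, here is a small Python
sketch that pulls the IRQ numbers for an interface out of text in
/proc/interrupts layout. The helper name irqs_for is made up, and the
sample rows are illustrative rather than a real capture.

```python
def irqs_for(interrupts_text, name):
    """Return the IRQ numbers whose device column names `name`.

    Matches the exact name (eth0) and per-queue names (eth1-rx-0).
    """
    found = []
    for line in interrupts_text.splitlines():
        head, sep, rest = line.partition(":")
        if not sep or not head.strip().isdigit():
            continue                  # header row, or named rows such as NMI
        devices = [field.rstrip(",") for field in rest.split()]
        if any(d == name or d.startswith(name + "-") for d in devices):
            found.append(int(head))
    return found


# Sample rows in /proc/interrupts layout (counts are illustrative)
SAMPLE = """\
           CPU0       CPU1
 41:          0          0   PCI-MSI-edge      eth0
 45:     197010      19487   PCI-MSI-edge      eth1-rx-0
 46:      27239      20276   PCI-MSI-edge      eth1-tx-0
 47:          0          0   PCI-MSI-edge      eth1
"""

print(irqs_for(SAMPLE, "eth0"))
print(irqs_for(SAMPLE, "eth1"))
```

Feeding it the contents of /proc/interrupts and comparing the result with
the "Interrupt:" value printed by ifconfig shows at a glance whether the
two agree.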



On Fri, Jul 25, 2014 at 9:58 PM, Oscar Salvador <
osalvador.vilard...@gmail.com> wrote:

> Hi People! How are you doing?
>
> I'm writting to you because I have a doubt about interrupts.
>
> If I look the interrupts assigned to my eth* with ifconfig, I get:
>
> eth0  Link encap:Ethernet  HWaddr bb:aa:bb:bb:aa:aa
>   Interrupt:20 Memory:f7e0-f7e2
>
> eth1  Link encap:Ethernet  HWaddr bb:aa:bb:bb:aa:aa
>   Interrupt:18 Memory:f7d0-f7d2
>
> As you can see, my system assigned IRQ-20 and IRQ-18 to eth0 and eth1.
>
> But If i look into /proc/interrupts, I don't have these interrupts:
>
> root@oscar:/home/oscar# cat /proc/interrupts
>         CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    CPU6    CPU7
>   0:      15       0       0       0       0       0       0       0  IR-IO-APIC-edge     timer
>   8:       0       1       0       0       0       0       0       0  IR-IO-APIC-edge     rtc0
>   9:       0       0       0       0       0       2       1       0  IR-IO-APIC-fasteoi  acpi
>  16:  191342   27819   25143   21231   19007   18159   17183   15717  IR-IO-APIC-fasteoi  ehci_hcd:usb3
>  19:      15       7       0       0       2       9       1       4  IR-IO-APIC-fasteoi  firewire_ohci
>  23:    1441      76      61      42     101      55      29      23  IR-IO-APIC-fasteoi  ehci_hcd:usb4
>  40:       0       0       0       0       0       0       0       0  DMAR_MSI-edge       dmar0
>  41:       0       0       0       0       0       0       0       0  DMAR_MSI-edge       dmar1
>  42:       0       0       0       0       0       0       0       0  IR-PCI-MSI-edge     xhci_hcd
>  43:   27318    1788    1314    1414    4046    2273    2232    2059  IR-PCI-MSI-edge     eth0
>  44:  115244   14686   10096    8738   41559   16021   10972   10090  IR-PCI-MSI-edge     ahci
>  45:  197010   19487   45260   14687   43697   29520   24546   21590  IR-PCI-MSI-edge     eth1-rx-0
>  46:   27239   20276   18861   14845   54218   17950   12907    9765  IR-PCI-MSI-edge     eth1-tx-0
>  47:       0       0       1       0       0       0       0       1  IR-PCI-MSI-edge     eth1
>  48:     262     150      78      60     261     249     168      47  IR-PCI-MSI-edge     snd_hda_intel
>  49:  857324   80338   67789   59555  682632   90385   78616   65048  IR-PCI-MSI-edge     i915
>
>
> As you can see, eth1 seems to have IRQ-45, IRQ-46 and IRQ-47, and
> eth0 has IRQ-43.
> I don't understand why ifconfig shows a different IRQ.
>
> Is this normal behaviour? Would someone be so kind as to explain this to me?
>
> Or maybe point me to some paper that explains this.
>
> thank you very much
> Best Regards
>
> Oscar
>
>



-- 
Regards,
Peter Teoh

Re: building kernel with -O

2014-07-30 Thread Peter Teoh

On Thu, Jul 31, 2014 at 12:59 AM, Xin Tong  wrote:

> why can not __builtin_return_address() be made *never* inline and use
> current level+1 to get the return address of the function of interest.  For
> any stack introspection, having 1 more level will not hurt functionality.
>

Actually, the answer to your remark is "impossible" in the case where the
kernel is compiled without frame pointers (CONFIG_FRAME_POINTER=n), which
is true for certain variants of RHEL / CentOS. Without a saved EBP on the
stack, there is no way to know where one stack frame ends and the previous
one begins. Of course you can statically walk the disassembly of the
function and see how much stack space that particular function has
allocated, but that requires implementing a disassembler in the kernel.

>
> given its explanation below
>
> Built-in Function: void * __builtin_return_address (unsigned int level)
>
> This function returns the return address of the current function, or of
> one of its callers. The level argument is number of frames to scan up the
> call stack. A value of 0 yields the return address of the current
> function, a value of 1 yields the return address of the caller of the
> current function, and so forth. When inlining the expected behavior is that
> the function returns the address of the function that is returned to. To
> work around this behavior use the noinline function attribute.
>
>
>
>

-- 
Regards,
Peter Teoh

Re: rootkits blocking using virtualization??

2014-07-30 Thread Peter Teoh

This is a recent, classic bug in an implementation of ideas like the ones
you mentioned:

http://xenbits.xenproject.org/xsa/advisory-98.html

There, all the mapping is done on the host side. The kernelnewbies page,
however, proposes doing something from the guest side, and if I have
control over the guest OS (as a rootkit), then I can potentially also undo
whatever the protection has done, depending on the exploitable paths of
entry that are available.




On Thu, Jul 31, 2014 at 8:31 AM, Peter Teoh  wrote:

> Are u referring to this:
>
> http://kernelnewbies.org/KernelProjects/VirtRootkitBlocker
>
> Just trying to answer your question:
>
> --Is the method of making kernel read only to block rootkits used in linux
> kernel mainline?
>
> I suspect not.   How are u going to distinguish between "legitimate
> program" and "rootkit" program?   Program includes both userland program
> and kernel modules.This distinction is needed, because legitimate
> kernel modules can call "kmalloc" and that is read/writeable kernel memory.
>   Supposed there is a vulnerability in the kernel modules (and thus
> userspace program can escalate privilege and execute into) then the
> "kmalloc" is executed on behalf of the malware, but outwardly it looks as
> if the kernel module is making a memory allocation.Unless u record down
> all the potential legitimate kernel execution path (sequence of EIP
> addresses), and compare it dynamically with the redirected path (as
> triggered by the malware), it seemed like impossible to distinguish.   And
> the database of path is also going to be very huge.
> Let me know if u have alternative ideas about setting kernel memory
> readonly.
>
> But on the other hand, this idea is also not new, explored before, for
> virtualization protection, NOT for rootkit detection.
>
> When u virtualized OS, the host has to set the all the memory given to the
> guest as readonly.   For details:
>
> For KVM:
>
> http://www.linux-kvm.org/wiki/images/3/33/KvmForum2008$kdf2008_15.pdf
>
> For Xen:
>
> http://wiki.xen.org/wiki/X86_Paravirtualised_Memory_Management
> http://lists.xen.org/archives/html/xen-devel/2009-10/msg01201.html
>
> And this page has good info:
>
> http://www.linux-kvm.org/page/Memory
>
> (read esp the "shadow page memory" mechanism, which is very expensive, and
> somewhat like the ideas proposed in the kernelnewbies mentor page).
>
>
>
> On Wed, Jul 30, 2014 at 7:44 PM, Aniket Shinde <
> universalvirus@gmail.com> wrote:
>
>> Hello guys,
>> I was going through kernelnewbies.org and came across a project
>> "Block Rootkits using Virtualization" by riel.
>>  Basically we have to make kernel read only after boot process
>> completes so rootkits get blocked.
>>  I have few doubts...
>>
>> --Is the method of making kernel read only to block rootkits used in
>> linux kernel mainline?
>>
>> --have anybody implenented this project already?
>>
>> --what is the good way to start with above project?
>>
>> --any guidelines to implemnet above project??
>>
>> --can I get any menor??
>>
>> --any material related to above project??
>>
>> (note: i have requested to mailing list but have not been approved yet.
>> So please reply me personely.)
>>
>> ___
>> Kernel-mentors mailing list
>> kernel-ment...@selenic.com
>> http://selenic.com/mailman/listinfo/kernel-mentors
>>
>>
>
>
> --
> Regards,
> Peter Teoh
>



-- 
Regards,
Peter Teoh

Re: rootkits blocking using virtualization??

2014-07-30 Thread Peter Teoh

Are you referring to this:

http://kernelnewbies.org/KernelProjects/VirtRootkitBlocker

Just trying to answer your question:

--Is the method of making kernel read only to block rootkits used in linux
kernel mainline?

I suspect not. How are you going to distinguish between a "legitimate"
program and a "rootkit" program? "Program" here includes both userland
programs and kernel modules. This distinction is needed because legitimate
kernel modules can call kmalloc(), which returns readable and writable
kernel memory. Suppose there is a vulnerability in a kernel module that
lets a userspace program escalate privileges and execute code inside it:
the kmalloc() is then executed on behalf of the malware, but from the
outside it looks as if the kernel module itself is making a memory
allocation. Unless you record all the potential legitimate kernel execution
paths (sequences of EIP addresses) and compare them dynamically with the
redirected path (as triggered by the malware), it seems impossible to
distinguish the two. And the database of paths is also going to be huge.
Let me know if you have alternative ideas about making kernel memory
read-only.

On the other hand, this idea is not new: it has been explored before, for
virtualization protection, NOT for rootkit detection.

When you virtualize an OS, the host has to set memory given to the guest
(in particular the guest's page tables) as read-only. For details:

For KVM:

http://www.linux-kvm.org/wiki/images/3/33/KvmForum2008$kdf2008_15.pdf

For Xen:

http://wiki.xen.org/wiki/X86_Paravirtualised_Memory_Management
http://lists.xen.org/archives/html/xen-devel/2009-10/msg01201.html

And this page has good info:

http://www.linux-kvm.org/page/Memory

(read especially about the "shadow page table" mechanism, which is very
expensive and somewhat like the ideas proposed on the kernelnewbies mentor
page).

On Wed, Jul 30, 2014 at 7:44 PM, Aniket Shinde  wrote:

> Hello guys,
> I was going through kernelnewbies.org and came across a project
> "Block Rootkits using Virtualization" by riel.
>  Basically we have to make kernel read only after boot process
> completes so rootkits get blocked.
>  I have few doubts...
>
> --Is the method of making kernel read only to block rootkits used in linux
> kernel mainline?
>
> --has anybody implemented this project already?
>
> --what is a good way to start with the above project?
>
> --any guidelines for implementing the above project??
>
> --can I get a mentor??
>
> --any material related to the above project??
>
> (note: I have requested to join the mailing list but have not been approved
> yet. So please reply to me personally.)
>
> ___
> Kernel-mentors mailing list
> kernel-ment...@selenic.com
> http://selenic.com/mailman/listinfo/kernel-mentors
>
>

-- 
Regards,
Peter Teoh

Re: global descriptor table in X86 Linux

2014-07-29 Thread Peter Teoh

Search for "lgdt" here (for the 32-bit kernel):

http://lxr.free-electrons.com/source/arch/x86/kernel/head_32.S
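
For reference, a "flat" 32-bit segment is just a descriptor with base 0 and the maximum limit. The Python sketch below packs such descriptors so the bit layout is visible. The helper names are invented for illustration, and the access/flag values (0x9A and 0x92 with flags 0xC) are the conventional ring-0 code/data settings, assumed here rather than copied from head_32.S. With 4 KiB granularity the 20-bit limit covers 4 GiB, i.e. 2^32 rather than 2^64; in 64-bit mode the CPU largely ignores base and limit for code and data segments.

```python
def gdt_descriptor(base, limit, access, flags):
    """Pack a legacy 8-byte x86 segment descriptor into one 64-bit integer.

    base   : 32-bit segment base address
    limit  : 20-bit segment limit (in 4 KiB units when the G flag is set)
    access : access byte (present, DPL, type)
    flags  : upper nibble (G, D/B, L, AVL)
    """
    desc = limit & 0xFFFF                 # limit bits 0..15
    desc |= (base & 0xFFFFFF) << 16       # base bits 0..23
    desc |= (access & 0xFF) << 40         # access byte
    desc |= ((limit >> 16) & 0xF) << 48   # limit bits 16..19
    desc |= (flags & 0xF) << 52           # granularity / size flags
    desc |= ((base >> 24) & 0xFF) << 56   # base bits 24..31
    return desc


def segment_size_bytes(limit, flags):
    """Size covered by a segment, honouring the granularity (G) bit."""
    granular = bool(flags & 0x8)
    return (limit + 1) * (4096 if granular else 1)


# Assumed flat ring-0 code and data segments: base 0, limit 0xFFFFF, G=1, D/B=1.
FLAT_CODE = gdt_descriptor(base=0, limit=0xFFFFF, access=0x9A, flags=0xC)
FLAT_DATA = gdt_descriptor(base=0, limit=0xFFFFF, access=0x92, flags=0xC)

if __name__ == "__main__":
    print(hex(FLAT_CODE), hex(FLAT_DATA))
    print(segment_size_bytes(0xFFFFF, 0xC))
```

The lgdt instruction then only loads the address and size of the table holding such entries.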


On Tue, Jul 29, 2014 at 4:04 AM, Xin Tong  wrote:

> Hi
>
> I've heard that Linux uses flat-mode segmentation, i.e. the
> segmentation base is forced to be 0 and the limit to be 2^64.
>
> I am having trouble finding the kernel code that sets up the GDT; can
> someone please point me in the right direction?
>
> Thanks a lot.
> Xin
>
>
>
>


-- 
Regards,
Peter Teoh

Re: Doubt Regarding Floating Point Arithmetic

2014-07-29 Thread Peter Teoh

You are welcome.

To sidetrack, there is a longstanding potential security issue (or just a
"feature") of the Linux kernel to keep in mind, though:

If you compile any program that declares "float" or "double" variables, you
will see a lot of XMM registers and their instructions being used. But
searching the kernel source for XMM shows that the kernel itself hardly
uses these registers.

So if you do your secret key calculations in these registers, be aware that
you can be context-switched at any time, beyond your control. Unless the
FPU/SSE state is saved and restored on every switch, another process could
then read the XMM register contents, and thus potentially your secret keys.

The same goes for the GPU (which is commonly used for password cracking):
the kernel does not touch that "memory" itself, so information can
potentially leak across processes.





On Wed, Jul 30, 2014 at 12:31 AM, Prasad Ram 
wrote:

> Thanks @Peter, a very good explanation, and it's very helpful to me.
>
>
> On 29 July 2014 19:49, Peter Teoh  wrote:
>
>> Perhaps a little explanation:anything that can be done at userspace,
>> should not be done at the kernel, simply because doing at the kernel
>> entailed a lot of security privileges being available.   (ie, logic which
>> require hardware interaction / access, process scheduling logic or anything
>> cutting across processes, sharing of common resources like memory etc)
>> floating point arithmetics is a good example which is not necessary to be
>> done in the kernel.   Lots of hardware registers are available for FPU
>> stuff (SSE/SSE2/XMM registers etc):
>>
>> http://en.wikipedia.org/wiki/SSE2
>> http://www.godevtool.com/TestbugHelp/XMMintins.htm
>> http://x86.renejeschke.de/html/file_module_x86_id_117.html
>>
>> and generally their usage entailed a lot of performance hits when used
>> extensively (another good reason to avoid it).   And more importantly,
>> context switching as  provided by Intel processor, the hardware operation
>> does not include the floating pointers registers (simply because there are
>> so many of them, and XMM can be like 128 bits long?)   Context switching
>> will swap out the entire registers set when switching from one process to
>> another, and if u were to do this for all the process, when 99% of the time
>> floating point are not in use, it is a terrible waste of CPU cycle.
>>
>> Userspace can only interact with the kernel through well-defined syscall
>> - for purpose of security, interprocess, or hardware access etc.   So
>> generally it is not possible to schedule floating point instruction (or any
>> user-defined instructions for that matter) to be executed in the kernel.
>>
>> But it is possible to schedule floating point arithmetics to be executed
>> in the kernel indirectly, for example, when u have a special hardware like
>> DSP that does floating point arithmetics, and u wrote a driver to schedule
>> instructions to be executed in that hardware unit.  And u have to worry
>> about many processes concurrently sending instructions to the same unit as
>> well.
>>
>> Thanks for the reading.
>>
>>
>>
>> On Wed, Jul 23, 2014 at 11:15 AM, me storage 
>> wrote:
>>
>>> Hi
>>> I am reading LDD .In that i didn't understand one point .In Chapter
>>> 2(Building and Running Modules) they mentioned that
>>>  " Kernel code cannot do floating point arithmetic"
>>> .My doubt is which code is used for floating point arithmetic that means
>>> at low level?
>>>
>>> Thank you
>>>
>>>
>>>
>>
>>
>> --
>> Regards,
>> Peter Teoh
>>
>
>


-- 
Regards,
Peter Teoh

Re: Doubt Regarding Floating Point Arithmetic

2014-07-29 Thread Peter Teoh

Perhaps a little explanation: anything that can be done in userspace should
not be done in the kernel, simply because running in the kernel means
running with a lot of privileges. (The kernel is for logic that requires
hardware interaction / access, process scheduling, or anything that cuts
across processes, such as sharing common resources like memory.) Floating
point arithmetic is a good example of something that does not need to be
done in the kernel. Lots of hardware registers are available for FPU work
(SSE/SSE2/XMM registers etc):

http://en.wikipedia.org/wiki/SSE2
http://www.godevtool.com/TestbugHelp/XMMintins.htm
http://x86.renejeschke.de/html/file_module_x86_id_117.html

and in general using them extensively entails a performance hit (another
good reason to avoid them). More importantly, the context switch as
provided by the Intel processor's hardware does not include the floating
point registers (simply because there are so many of them, and each XMM
register is 128 bits wide). A context switch swaps out the entire register
set when switching from one process to another, and if you were to do this
for the floating point registers of every process, when 99% of the time
floating point is not in use, it would be a terrible waste of CPU cycles.

Userspace can only interact with the kernel through well-defined syscalls,
for purposes such as security, interprocess communication, or hardware
access. So in general it is not possible for userspace to schedule floating
point instructions (or any user-defined instructions, for that matter) to
be executed in the kernel.

But it is possible to have floating point arithmetic carried out through
the kernel indirectly, for example when you have special hardware like a
DSP that does floating point arithmetic, and you write a driver that
schedules instructions to be executed on that hardware unit. You then also
have to worry about many processes concurrently sending instructions to the
same unit.

Thanks for reading.
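
As a side illustration of how code can get by without floating point at all: fractional values can be kept as scaled integers (fixed point). The Python sketch below assumes 11 fractional bits, the same kind of scheme the kernel's load-average accounting uses as far as I know; the function names are made up for this example.

```python
FSHIFT = 11                 # number of fractional bits (assumed for this sketch)
FIXED_1 = 1 << FSHIFT       # the value 1.0 in fixed point


def to_fixed(numerator, denominator):
    """Represent numerator/denominator as a scaled integer (no floats used)."""
    return (numerator * FIXED_1) // denominator


def fixed_int(x):
    """Integer part of a fixed-point value."""
    return x >> FSHIFT


def fixed_frac_hundredths(x):
    """First two decimal digits of the fractional part."""
    return ((x & (FIXED_1 - 1)) * 100) >> FSHIFT


def format_fixed(x):
    """Render a fixed-point value as text using only integer operations."""
    return "%d.%02d" % (fixed_int(x), fixed_frac_hundredths(x))


def fixed_mul(a, b):
    """Multiply two fixed-point values and rescale the result."""
    return (a * b) >> FSHIFT


if __name__ == "__main__":
    one_and_half = to_fixed(3, 2)
    print(format_fixed(one_and_half))
    print(format_fixed(fixed_mul(one_and_half, one_and_half)))
```

Only shifts, masks and integer multiplication are involved, so nothing here would touch the FPU or XMM registers.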

On Wed, Jul 23, 2014 at 11:15 AM, me storage 
wrote:

> Hi
> I am reading LDD. There is one point in it that I didn't understand. In
> Chapter 2 (Building and Running Modules) they mention that
>  "Kernel code cannot do floating point arithmetic".
> My question is: which code is used for floating point arithmetic, that is,
> at the low level?
>
> Thank you
>
>
>

-- 
Regards,
Peter Teoh

Re: how to determine kernel interrupt latency

2014-03-16 Thread Peter Teoh

I have never used any of the following software before; I am just
suggesting it based on reading online.

cyclictest: read this:

http://www.spinics.net/lists/linux-rt-users/msg04088.html (as explained
there, "timer interrupt latency" is what cyclictest measures; I am not sure
how it is done).

More on cyclictest:

http://elinux.org/images/0/01/Elc2013_rowand.pdf
http://people.redhat.com/williams/latency-howto/rt-latency-howto.txt
(the last section covers different ways of using cyclictest to measure
interrupt latency.)
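
As I understand it, the basic idea behind a cyclictest-style measurement is to sleep until a known deadline and record how late the wake-up actually is. The following is only a userspace sketch of that idea in Python, not how cyclictest itself is implemented; the function names and the 1 ms interval are assumptions. What it measures is timer expiry plus scheduler wake-up plus interpreter overhead, so it is an upper bound and not a pure interrupt latency.

```python
import time


def wakeup_latencies_ns(interval_ns=1_000_000, loops=50):
    """Sleep until successive absolute deadlines and record how late we wake up."""
    samples = []
    deadline = time.monotonic_ns() + interval_ns
    for _ in range(loops):
        remaining = deadline - time.monotonic_ns()
        if remaining > 0:
            time.sleep(remaining / 1e9)
        # Lateness = actual wake-up time minus the time we asked for.
        samples.append(time.monotonic_ns() - deadline)
        deadline += interval_ns
    return samples


def summarize(samples):
    """Return (min, average, max) in the same unit as the samples."""
    return min(samples), sum(samples) // len(samples), max(samples)


if __name__ == "__main__":
    lo, avg, hi = summarize(wakeup_latencies_ns())
    print("latency ns: min=%d avg=%d max=%d" % (lo, avg, hi))
```

Running it on an idle and then on a loaded machine gives a feel for how much of the number is scheduling rather than the interrupt itself.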

another:

https://github.com/atlas555/rt-test   (interrupt_tool)

Another is "intrperf", which originates from FreeBSD, but there is a Linux
version. Google for it (
https://repos.dcl.info.waseda.ac.jp/spumone/trac/wiki/Interrupt%20Latency%20of%20Linux%20(intrperf)
).

As highlighted here:

http://marc.info/?l=linux-smp&m=102733872816465

interrupt latencies are so small in order of magnitude, or so dependent on
low-level CPU features like the I-cache

http://marc.info/?l=linux-arm-kernel&m=107472646713656

that tuning them is neither easy nor worth the time. Maybe I am wrong.

Check this for another discussion:

http://marc.info/?t=10740345885&r=1&w=2

On Sun, Mar 16, 2014 at 9:09 PM, loody  wrote:

> hi peter:
>
>
> 2014-01-17 13:41 GMT+08:00 Peter Teoh :
> >
> http://stackoverflow.com/questions/15383259/are-there-any-kernel-tools-available-to-measure-interrupt-latency-with-reasonabl
> >
> > checkout cyclictest.
> I have checked cyclictest.
> from the manual page, it seems to be used to measure thread latency
> rather than interrupt latency.
> Would you please let me know if there is any command for using it
> to check interrupt latency?
> thanks fo

-- 
Regards,
Peter Teoh

Re: Pass through kernel memory manager

2014-02-19 Thread Peter Teoh

The address you passed to --section-start looks odd, given that your
physical memory is so limited (8K and 128K: are these two different banks?
If so, is only one of them available at any one time?).

Perhaps some knowledge of linker scripts would help:

http://blogs.bu.edu/md/2011/11/15/the-dark-art-of-linker-scripts/

The "1:1" mapping is called identity mapping, and a linker script provides
a way for you to place the binary in a specific part of physical memory.



On Sat, Feb 8, 2014 at 4:29 AM, Paul Chavent  wrote:

> Hi
>
> I'm working on an ARM926EJS based SOM (OMAPL138). The ARM has internal
> memory spaces (8k one and 128k one) where i would like to put some code.
>
> I thought to use something like :
>
> void foobar (void) __attribute__ ((section ("bar")));
>
> Then link with
>
> -Wl,--section-start,bar=1000
>
>
> But the Linux loader fails to load this segment.
>
> So, is it worth to try to achieve to run code at desired position ?
>
> Is there any way to tell Linux to 1:1 map some physical regions to
> processes address space ? Perhaps the memmap= kernel parameter ?
>
> Thanks for your help.
>
> Paul.
>
>
>



-- 
Regards,
Peter Teoh

Re: Firmware Loading every boot?

2014-02-13 Thread Peter Teoh

FYI, the "firmware" is loaded from flash:

http://en.wikipedia.org/wiki/Flash_memory

which means a microcontroller (or microprocessor) + DMA/DDR memory + flash
is the usual makeup of an embedded system. Flash is non-volatile, but it is
normally slower and cannot be executed directly as CPU or microcontroller
instructions, which is why you need to load it into memory to be
executed:

http://lwn.net/Articles/135472/

cheers.

On Mon, Feb 10, 2014 at 9:29 PM, Jeshwanth  wrote:

> Hello List,
>
> I came to know that Linux loads firmware for my DMA every time it boots.
> But I don't understand why it needs to be loaded on every boot.
> Doesn't the DMA hold what was loaded previously?
> AFAIK, firmware is a program which runs in devices.
>
> Please correct me if I am wrong.
>
> Thanks :)
>
> Regards,
> Jeshwanth
>
> Sent from my HTC
>
>
>
>

-- 
Regards,
Peter Teoh

Re: how to determine kernel interrupt latency

2014-01-16 Thread Peter Teoh

http://stackoverflow.com/questions/15383259/are-there-any-kernel-tools-available-to-measure-interrupt-latency-with-reasonabl

Check out cyclictest.


On Sat, Jan 11, 2014 at 3:25 PM, loody  wrote:

> hi all:
> is it possible to determine interrupt latency in kernel with any ftrace or
> proc?
>
> --
> Regards,
>
>



-- 
Regards,
Peter Teoh

Re: How to access a DRM CRTC's scan out buffer?

2014-01-16 Thread Peter Teoh

As indicated here:

http://www.botchco.com/agd5f/?p=51

the input to the CRTC is the framebuffer, and the output of the CRTC is
already monitor-level information, which is meaningless to you. So the best
bet is to get it at the framebuffer level?

Correct me if I am wrong.


On Thu, Jan 16, 2014 at 8:06 PM, Sannu K  wrote:

> On Thu, Jan 16, 2014 at 1:14 PM, Peter Teoh wrote:
>
>> In general how it worked is explained here:
>>
>> https://www.kernel.org/doc/htmldocs/drm/drm-kms-init.html
>>
>> Not sure which is the name of your video card, but I think in general all
>> the page flip API should have access to the scan buffer (see link above).
>> For Intel these are possible APIs
>>  :
>>
>>
> Thanks. I was trying to find out a generic way to access the scan out
> buffer. The page flip functions looks specific to hardware.
>
>
>>
>> static void do_intel_finish_page_flip(struct drm_device *dev,
>> void intel_finish_page_flip(struct drm_device *dev, int pipe)
>> do_intel_finish_page_flip(dev, crtc);
>> void intel_finish_page_flip_plane(struct drm_device *dev, int plane)
>> do_intel_finish_page_flip(dev, crtc);
>> void intel_prepare_page_flip(struct drm_device *dev, int plane)
>>  * is also accompanied by a spurious intel_prepare_page_flip().
>> inline static void intel_mark_page_flip_active(struct intel_crtc
>> *intel_crtc)
>>
>> --
>> Regards,
>> Peter Teoh
>>
>
> It is enough to have a way to get the content of scan out buffer instead
> of accessing it directly using a pointer.
>
> Thanks for your help,
> Sannu K
>



-- 
Regards,
Peter Teoh

Re: How to access a DRM CRTC's scan out buffer?

2014-01-15 Thread Peter Teoh

For ATI GPU the crtc_base could be the base pointer to the memory buffer:

./drivers/gpu/drm/radeon/rv770.c:
u32 rv770_page_flip(struct radeon_device *rdev, int crtc_id, u64 crtc_base)

./drivers/gpu/drm/radeon/rs600.c:
void rs600_pre_page_flip(struct radeon_device *rdev, int crtc)
void rs600_post_page_flip(struct radeon_device *rdev, int crtc)
u32 rs600_page_flip(struct radeon_device *rdev, int crtc_id, u64 crtc_base)

As for the internals of these buffer areas, well, you may need the
datasheet from the vendor. Just grep for "CRTC" inside the gpu/drm/radeon
directory and you will understand why.


On Thu, Jan 16, 2014 at 3:44 PM, Peter Teoh  wrote:

> In general how it worked is explained here:
>
> https://www.kernel.org/doc/htmldocs/drm/drm-kms-init.html
>
> Not sure which is the name of your video card, but I think in general all
> the page flip API should have access to the scan buffer (see link above).
> For Intel these are possible APIs
> :
>
>
> static void do_intel_finish_page_flip(struct drm_device *dev,
> void intel_finish_page_flip(struct drm_device *dev, int pipe)
> do_intel_finish_page_flip(dev, crtc);
> void intel_finish_page_flip_plane(struct drm_device *dev, int plane)
> do_intel_finish_page_flip(dev, crtc);
> void intel_prepare_page_flip(struct drm_device *dev, int plane)
>  * is also accompanied by a spurious intel_prepare_page_flip().
> inline static void intel_mark_page_flip_active(struct intel_crtc
> *intel_crtc)
>
>
> On Sat, Jan 11, 2014 at 9:27 PM, Sannu K  wrote:
>
>> Hi,
>>
>> I would like to access a monitor's content in kernel mode. I tried but
>> could not find a generic way to access CRTC's scan out buffer in kernel
>> mode. I prefer to do it in kernel mode as an experiment. Any pointers will
>> greatly help.
>>
>> Thanks and Regards,
>> Sannu K
>>
>>
>>
>
>
> --
> Regards,
> Peter Teoh
>



-- 
Regards,
Peter Teoh

Re: How to access a DRM CRTC's scan out buffer?

2014-01-15 Thread Peter Teoh

In general, how it works is explained here:

https://www.kernel.org/doc/htmldocs/drm/drm-kms-init.html

I am not sure what your video card is, but I think in general all the page
flip APIs should have access to the scanout buffer (see link above).
For Intel, these are possible APIs:

static void do_intel_finish_page_flip(struct drm_device *dev,
void intel_finish_page_flip(struct drm_device *dev, int pipe)
do_intel_finish_page_flip(dev, crtc);
void intel_finish_page_flip_plane(struct drm_device *dev, int plane)
do_intel_finish_page_flip(dev, crtc);
void intel_prepare_page_flip(struct drm_device *dev, int plane)
 * is also accompanied by a spurious intel_prepare_page_flip().
inline static void intel_mark_page_flip_active(struct intel_crtc *intel_crtc)

On Sat, Jan 11, 2014 at 9:27 PM, Sannu K  wrote:

> Hi,
>
> I would like to access a monitor's content in kernel mode. I tried but
> could not find a generic way to access CRTC's scan out buffer in kernel
> mode. I prefer to do it in kernel mode as an experiment. Any pointers will
> greatly help.
>
> Thanks and Regards,
> Sannu K
>
>
>

-- 
Regards,
Peter Teoh

Re: DMA, CMA, coherence and performance

2014-01-05 Thread Peter Teoh

I think this discussion should help you:

http://e2e.ti.com/support/embedded/linux/f/354/t/89419.aspx

other failures:

http://stackoverflow.com/questions/14625919/allocating-a-large-dma-buffer

and some guideline here:

https://www.kernel.org/doc/Documentation/DMA-API.txt

https://lkml.org/lkml/2011/3/25/19

As I don't have any specific crash dump or error information, there is
nothing more I can say about your problem. It is quite difficult to make
general comments.
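
One piece of arithmetic that may help when sizing such buffers: the page allocator hands out blocks of 2^order physically contiguous pages, so a request is effectively rounded up to a power of two. The Python sketch below mirrors what the kernel's get_order() computes, assuming 4 KiB pages. It does not model whether an order that large can actually be satisfied, which depends on the kernel's maximum order and on fragmentation.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def get_order(size, page_size=PAGE_SIZE):
    """Smallest order such that (page_size << order) >= size."""
    if size <= 0:
        raise ValueError("size must be positive")
    pages = -(-size // page_size)      # ceiling division
    return (pages - 1).bit_length()


if __name__ == "__main__":
    for mib in (2, 4, 8):
        size = mib * 1024 * 1024
        order = get_order(size)
        print("%d MiB -> order %d (%d contiguous pages)" % (mib, order, 1 << order))
```

For the 2-8MB frames mentioned in the question this gives orders 9 to 11, which is why large contiguous buffers tend to be reserved early or taken from CMA.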



On Fri, Jan 3, 2014 at 7:20 AM, Steven Bell  wrote:

> Hi,
>
> I'm working on a device driver for a video device which continuously reads
> and writes image frames using DMA. The frames are fairly large, in the
> range of 2-8MB, and I would like the buffers for them to be contiguous
> because of my hardware. My understanding is that using the contiguous
> memory allocator is the current "right way" to get the buffers, and that
> CMA operates entirely behind the scenes when calls are made to
> dma_alloc_coherent().
>
> However, it seems that for this system, a streaming DMA setup would be
> more appropriate.  The buffer gets filled with data once, handed to the
> device, and then isn't touched again until it gets reused with new data.
> The resources I've read have hinted that streaming DMA has some performance
> benefits over coherent DMA, so this seems like the way to go.  But I
> haven't seen any discussion of how to use CMA with streaming DMA (or
> whether such a thing is even necessary).
>
> Does the CMA also work behind get_free_pages, or other kernel memory
> allocation methods?  Does it matter?  The kernel newbies page on memory
> allocation (http://kernelnewbies.org/KernelMemoryAllocation) says that
> get_free_pages up to about 8MB are ok.  Is that a generalization based on
> typical memory fragmentation, or a guarantee?
>
> Thanks,
> Steven
>
>
>


-- 
Regards,
Peter Teoh

Re: Bug 12665

2014-01-02 Thread Peter Teoh

This list (Linux-API) focuses on adding new APIs to the Linux platform. So
perhaps this thread about timing may get you started:

http://www.spinics.net/lists/linux-api/msg02243.html

or in general:

https://www.google.com.sg/search?q=site%3Awww.spinics.net%2Flists%2Flinux-api%2F+time


On Fri, Jan 3, 2014 at 2:43 AM, johnd  wrote:

> On Tue, Dec 24, 2013 at 02:19:30PM +0800, Peter Teoh wrote:
> > the DELAYTIMER_MAX is for realtime POSIX.
> >
> > but Linux is based on http://en.wikipedia.org/wiki/Linux_Standard_Base,
> > which is LSB.
> >
> > There is no direct mapping between LSB and POSIX, but perhaps this:
> >
> > http://man7.org/linux/man-pages/man7/time.7.html
> >
> > and
> >
> > http://pubs.opengroup.org/onlinepubs/7908799/xsh/timer_gettime.html
> >
> > Look carefully between the two and you can perhaps find the balancing
> point
> > u will need for implementing this feature.
>
> Thanks for the explanation.  I was just looking at bugs in bugzilla that
> I could actually reproduce.  I'm just getting started with kernel
> programming and am looking for bugs I can observe.
>
>


-- 
Regards,
Peter Teoh

Re: Bug 12665

2013-12-23 Thread Peter Teoh

Reading the specs:

http://pubs.opengroup.org/onlinepubs/7908799/xsh/timer_gettime.html

DELAYTIMER_MAX is for realtime POSIX.

But Linux distributions follow the Linux Standard Base (LSB):
http://en.wikipedia.org/wiki/Linux_Standard_Base

There is no direct mapping between LSB and POSIX, but perhaps compare this:

http://man7.org/linux/man-pages/man7/time.7.html

and

http://pubs.opengroup.org/onlinepubs/7908799/xsh/timer_gettime.html

Compare the two carefully and you can perhaps find the balance you will
need for implementing this feature.

Whether it is a kernel bug or a userspace bug is therefore highly
debatable.
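
To make the difference concrete, here is a small Python sketch. It assumes the report is about the timer overrun count (timer_getoverrun()) not being capped at DELAYTIMER_MAX, and it assumes DELAYTIMER_MAX equals INT_MAX; both are assumptions on my side, and the function names are invented. It contrasts a signed 32-bit counter that simply wraps around with one that saturates at the limit, as the POSIX text describes.

```python
INT_MAX = 2**31 - 1
DELAYTIMER_MAX = INT_MAX  # assumption: the limit equals INT_MAX


def overrun_wrapping(true_overruns):
    """Overrun count as seen through a signed 32-bit int that wraps around."""
    return (true_overruns + 2**31) % 2**32 - 2**31


def overrun_clamped(true_overruns, limit=DELAYTIMER_MAX):
    """Overrun count saturated at the limit, as the POSIX text describes."""
    return min(true_overruns, limit)


if __name__ == "__main__":
    for n in (5, INT_MAX, INT_MAX + 1, 2**32 + 7):
        print(n, overrun_wrapping(n), overrun_clamped(n))
```

A fix could in principle live on either side of the syscall boundary, which is part of why the kernel-versus-userspace question is open.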

On Tue, Dec 17, 2013 at 1:29 PM, John de la Garza  wrote:

> I found a bug that appears to be simple to fix.  I assume I am missing
> something.
>
> here is a link to the bug description:
> https://bugzilla.kernel.org/show_bug.cgi?id=12665
>
> the man page for the function in the bug report mentions that Linux does
> not implement the desired functionality
>
>
> It seems like it is accepted as working the way it does, and at the same
> time it is reported in bugzilla as a current bug.
>
>
> What am I missing?
>
>
>
>

-- 
Regards,
Peter Teoh

Re: help in developing soft and hardlockup detection tool

2013-12-15 Thread Peter Teoh

I think the logic as described is not possible and does not quite make
sense. Essentially, you cannot disable interrupts, loop for 11 seconds, and
re-enable interrupts after that, because the timer is not going to trigger
for you once interrupts are disabled. But you can of course do some
pre-calculation: for your CPU and platform, do a precise low-level timing
to assess how many instructions of a certain type are needed to achieve a
given duration, say 1 microsecond. Then you implement a deterministic loop
of 1 million iterations to get a timing delay of exactly 1 second for ONE
CPU. You can disable interrupts before entering that deterministic loop,
and once out of the loop you can enable interrupts again.

the whole operation has to be precisely calculated and extrapolated from
microseconds to seconds, and it really varies from CPU to CPU, or even same
CPU in different platform.   and btw, normal kernel operation is always
with interrupt enabled, so all performance timestamping measurement will be
very different in your constraint of disabling interrupt, which u are
trying to do to simulate hardlockup.

On Sun, Dec 15, 2013 at 4:27 PM, Vipul Jain  wrote:

> Hi,
>
> I would like to write a kernel module that will induce the softlockup and
> hardlockup on the cpu core(s). Below is my logic and was wondering if some
> one can help me verify and guide me creating a thread and other stuff for
> implementing the logic.
>
> softlockup:
> on given cpu number.
> 1. disable kernel preemption
> 2. keep looping for 21 seconds (as per kernel Documentation it takes 20
> seconds to detect and I would like to recover the system once its detected).
> 3. release the cpu
>
> hardlockup
> on given cpu number.
> 1. disable interrupts.
> 2. keep looping for 11 seconds.
> 3. enable interrupts and release cpu.
>
> Regards,
> Vipul.
>
>
>
>

-- 
Regards,
Peter Teoh

Re: network register on /proc fs

2013-12-15 Thread Peter Teoh

Run "strace ifconfig -a" and you can see the following on stderr:

open("/proc/net/dev", O_RDONLY) = 6
open("/proc/net/if_inet6", O_RDONLY)= 6
open("/proc/net/if_inet6", O_RDONLY)= 6
open("/proc/net/if_inet6", O_RDONLY)= 6

Note that the fd returned is 6.   You can also see a process's open fds
under /proc/PID/fd.

"ls -al /proc/PID/fd" for a particular process gives:

>ls -al /proc/2187/fd
total 0

lr-x------ 1 xxx xxx 64 Dec 16 07:26 0 -> /dev/null
l-wx------ 1 xxx xxx 64 Dec 16 07:26 1 -> /dev/null
lrwx------ 1 xxx xxx 64 Dec 16 07:26 10 -> anon_inode:[eventfd]
lrwx------ 1 xxx xxx 64 Dec 16 07:26 11 -> /dev/dri/card0

Registration of the /proc entries happens when /proc is initialized and
created:

fs/proc/root.c:proc_root_init()

and the function for the network entries is proc_net_init() inside
fs/proc/proc_net.c.
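
To see from userspace what ifconfig gets out of /proc/net/dev, here is a
minimal sketch in C.   It assumes the usual layout of that file (two header
lines, then one "name: counters..." line per interface); parse_ifname() and
list_interfaces() are names I made up for illustration, not anything taken
from ifconfig itself.

```c
#include <stdio.h>
#include <string.h>

/* Copy the interface name from one /proc/net/dev data line into name.
 * Returns 0 on success, -1 for header lines or names that do not fit. */
int parse_ifname(const char *line, char *name, size_t size)
{
    const char *colon = strchr(line, ':');
    size_t len;

    if (colon == NULL)
        return -1;              /* the header lines have no "name:" part */
    while (*line == ' ' || *line == '\t')
        line++;                 /* names are padded with leading spaces */
    len = (size_t)(colon - line);
    if (len == 0 || len >= size)
        return -1;
    memcpy(name, line, len);
    name[len] = '\0';
    return 0;
}

/* Print every interface listed in /proc/net/dev.
 * Returns the number of interfaces found, or -1 if the file cannot be opened. */
int list_interfaces(void)
{
    char line[512], name[64];
    int count = 0;
    FILE *fp = fopen("/proc/net/dev", "r");

    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof(line), fp) != NULL) {
        if (parse_ifname(line, name, sizeof(name)) == 0) {
            printf("%s\n", name);
            count++;
        }
    }
    fclose(fp);
    return count;
}
```

Calling list_interfaces() from a small main() should print the same
interface names that "ifconfig -a" shows.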

On Sat, Dec 7, 2013 at 12:58 PM, Hatt Tom  wrote:

> hi:
>
>   Does the network relevent register on  /proc fs ? when does it register ?
>
>   which entry of /proc will be used by ifconfig" command ?
>
>
> Thanks!
> --
> Best Regards!
>
>



-- 
Regards,
Peter Teoh

Re: Should I pass user-space buffer pointer to read() of struct file implemented by `filp_open()`?

2013-12-05 Thread Peter Teoh

On Wed, Nov 27, 2013 at 9:57 PM, 乃宏周  wrote:

> In module code:
>
> *unsigned char buf[20];*
>
> *struct file *device;*
>
> *device = filp_open(...);*
>
> *device->f_op->read(device,buf,20,&device->f_pos);*
>
> In signature(interface) of *read()* of *struct file*, *buf* should came
> from user-space. I fed my buffer, and I get correct data from that, Is that
> correct? Shouldn't I provide a user-space buffer to that ?
>

Some convention in kernel programming:

long do_sys_open(int dfd, const char __user *filename, int flags, umode_t
mode)
{

Here __user is used in the declaration to say explicitly that the pointer
points to userspace data.

Without it, a pointer is expected to point to kernel-allocated memory, and
you use copy_from_user() to copy data from userspace into such a kernel
buffer.

>
>
>

-- 
Regards,
Peter Teoh

Re: about cheating upper layers

2013-12-05 Thread Peter Teoh

What you said about ip_rcv() should work, since it runs even earlier than
TCP, so no problem.

Your IP address translation is exactly what NAT does, so you definitely
can remap it to whatever address you like.

But port + IP address does not imply anything about which NIC port the
packet came from.   Perhaps the L2/MAC layer can provide the redirection...
this part I am not sure about.



On Thu, Dec 5, 2013 at 1:41 AM, Guibin(Bill) Tian  wrote:

> Thanks Peter for your explanation.  But in fact, I am not going to touch
> transport layer. The work shall be done inside the call stack of ip_rcv().
>
> In ip layer, there is no specific process information, so the process
> assignment shouldn't be a problem.
> At the application layer, each application maintains its own socket pair.
>
> My concern is that if the packet is from another NIC rather than the one
> used to make the connection, can I make it transparent to the application
> by only modifying the source and destination address in the ip header?
>
>
>
> On Wed, Dec 4, 2013 at 12:01 PM, Peter Teoh wrote:
>
>>
>>
>>
>> On Wed, Dec 4, 2013 at 7:49 PM, Peter Teoh wrote:
>>
>>>
>>>
>>>
>>> On Mon, Dec 2, 2013 at 1:48 PM, Guibin(Bill) Tian wrote:
>>>
>>>> Hi there,
>>>> Right now, I am trying to do such a thing.
>>>>
>>>> If a computer has multiple interface A and B, assume the packet is from
>>>> device A.
>>>> At ip layer, before the packet is transmitted to transport layer, I
>>>> change the source address and destination address of the IP header and
>>>> transmit to transport layer to pretend that this packet is from device B.
>>>> Not sure whether this can work or not.
>>>>
>>>
>>> feasible?   yes, theoretically u can change IP address, but there is a
>>> problem.   TCP port + IP address is combined together to uniquely identify
>>> which "socket" to pass to.   And each socket is always associated with each
>>> process.   if u change that the packet will be redirected to another
>>> process.   (this process context identification is done in the upper layer
>>> of TCP, ie, in interrupt context the packet has not been associated with
>>> any process yet.)
>>>
>>> and remember there is a checksum (TCP and IP) that need to be patched
>>> whenever u change anything.
>>>
>>> not sure why u want to that, but I suspect Netfilter should fulfill your
>>> requirement as well?
>>>
>>>
>>
>> Sorry, not sure if the earlier explanation is clear to you?
>>
>> So to be specific:
>>
>> in net/ipv4/tcp_ipv4.c:
>>
>> This is from the ingress path (still executing in software interrupt
>> context mode, ie, packet has not been assigned to any process yet, and so
>> you can always modify the packet content):
>>
>> int tcp_v4_rcv(struct sk_buff *skb)
>> {
>> const struct iphdr *iph;
>> const struct tcphdr *th;
>> struct sock *sk;
>> int ret;
>> struct net *net = dev_net(skb->dev);
>>
>> if (skb->pkt_type != PACKET_HOST)
>> goto discard_it;
>>
>> /* Count it even if it's bad */
>> TCP_INC_STATS_BH(net, TCP_MIB_INSEGS);
>>
>> if (!pskb_may_pull(skb, sizeof(struct tcphdr))
>>
>> and in include/net/sock.h:
>>
>> /* This is the per-socket lock.  The spinlock provides a synchronization
>>  * between user contexts and software interrupt processing, whereas the
>>  * mini-semaphore synchronizes multiple users amongst themselves.
>>  */
>> typedef struct {
>> spinlock_t  slock;
>> int owned;
>> wait_queue_head_t   wq;
>> /*
>>
>> To modify the packet, you can either do it before the above function, or
>> after the function, but before the packet gets assigned to its rightful
>> owner.   MY GUESS*
>>
>>
>>>
>>>> There are other interface specific information in the skb structure
>>>> like the net_device member. If I pass the packet to transport layer only
>>>> with my proposed modification, will the application's socket detect this?
>>>> I didn't look into the socket match code, not sure if the socket match will
>>>> check other information in skb struct beside the ip address and port 
>>>> number.
>>>>
>>>> Thanks for your help.
>>>>
>>>> Bill
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Regards,
>>> Peter Teoh
>>>
>>
>>
>>
>> --
>> Regards,
>> Peter Teoh
>>
>
>


-- 
Regards,
Peter Teoh

Re: watchdog pet in kernel module

2013-12-04 Thread Peter Teoh

On Thu, Dec 5, 2013 at 10:19 AM, Rajat Sharma  wrote:

> Although /dev/watchdog is available in usermode, but nothing should stop
> you to write to it from a kernel thread.
>
> Rajat
>

I don't think /dev/watchdog (literally, I mean) is available in the
kernel.   It is accessible in userspace, but goes by a different name
inside the kernel.   Moreover, if you access the variable directly,
bypassing all the spinlocks (see drivers/watchdog and look for the
"wdt_lock" spinlock) implemented around it, you might run into a race
condition.

BUT... if you really insist on probing from inside the kernel... it is not
a watchdog, it is a "process watch", in your own way.

I.e., you can always write a loop that periodically probes the status of
that specific process to make sure it is in the RUNNING state (vs BLOCKING
when it is waiting for some I/O, or locks, to complete), and perhaps check
the CPU instructions to make sure that it is not going into a tight loop
(i.e., a userspace program that literally does "while(true)
{do_nothing()}")... and many other possible "hung" criteria for a process
as well.   Not easy... in fact extremely complex.


>
>
> On Wed, Dec 4, 2013 at 5:50 PM, Peter Teoh wrote:
>
>>
>>
>>
>> On Thu, Dec 5, 2013 at 9:06 AM, Vipul Jain  wrote:
>>
>>>
>>>
>>>
>>> On Wed, Dec 4, 2013 at 4:57 PM,  wrote:
>>>
>>>> On Wed, 04 Dec 2013 16:45:44 -0800, Vipul Jain said:
>>>>
>>>> > If you don't mind can you please provide me more insight as what can
>>>> be
>>>> > false alarm I can encounter to move pet inside kernel module?
>>>>
>>>> The issue isn't false alarms - it's failure to alarm when it should.
>>>>
>>>> The problem is that it's possible for a kernel to get wedged in such a
>>>> way that
>>>> a kernel thread is still able to feed the watchdog timer on a regular
>>>> basis,
>>>> but userspace is effectively hung and unable to proceed.  For example,
>>>> if an
>>>> OOPS happens while a filesystem lock is held, all future userspace
>>>> references
>>>> to that filesystem (and possibly all filesystems of the same type) will
>>>> hang,
>>>> eventually strangling the box while the kernel is still perfectly able
>>>> to keep
>>>> the watchdog working.
>>>>
>>>> Hi Valdis,
>>>
>>> I see what you are saying but what if the user process that's feeding
>>> the dog gets hung and rest of the system is fine then it will bring the
>>> whole system down won't it? I basically want to avoid this?
>>>
>>>
>> Normally the process that feeds the dog is a simple process that JUST
>> periodically sets the watchdog device descriptor.   Yes, one main() with a
>> while loop just periodically resetting the descriptor.
>>
>> And so if it is not able to respond in time, by inference, some OTHER
>> PROCESS must have hung.   In other systems I saw there is a mother process
>> that monitors a few (not all) of its key child processes... so perhaps each
>> child will have one variable to signal to the mother that it is running.
>> If a child is not responding in time, the mother will clean up everything
>> and then purposely stop setting the watchdog, resulting in a reboot.
>>
>>
>>> Regards,
>>> Vipul.
>>>
>>>
>>
>>
>> --
>> Regards,
>> Peter Teoh
>>
>>
>>
>


-- 
Regards,
Peter Teoh

Re: watchdog pet in kernel module

2013-12-04 Thread Peter Teoh

On Thu, Dec 5, 2013 at 9:06 AM, Vipul Jain  wrote:

>
>
>
> On Wed, Dec 4, 2013 at 4:57 PM,  wrote:
>
>> On Wed, 04 Dec 2013 16:45:44 -0800, Vipul Jain said:
>>
>> > If you don't mind can you please provide me more insight as what can be
>> > false alarm I can encounter to move pet inside kernel module?
>>
>> The issue isn't false alarms - it's failure to alarm when it should.
>>
>> The problem is that it's possible for a kernel to get wedged in such a
>> way that
>> a kernel thread is still able to feed the watchdog timer on a regular
>> basis,
>> but userspace is effectively hung and unable to proceed.  For example, if
>> an
>> OOPS happens while a filesystem lock is held, all future userspace
>> references
>> to that filesystem (and possibly all filesystems of the same type) will
>> hang,
>> eventually strangling the box while the kernel is still perfectly able to
>> keep
>> the watchdog working.
>>
>> Hi Valdis,
>
> I see what you are saying but what if the user process that's feeding the
> dog gets hung and rest of the system is fine then it will bring the whole
> system down won't it? I basically want to avoid this?
>
>
Normally the process that feeds the dog is a simple process that JUST
periodically sets the watchdog device descriptor.   Yes, one main() with a
while loop just periodically resetting the descriptor.
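
Such a feeder can be sketched roughly as below.   This is only a minimal
sketch, assuming the simple write-based interface described in the kernel's
watchdog-api.txt (any write to the device pings the timer); feed_watchdog()
and run_feeder() are hypothetical names, and the device path and interval
are whatever your system needs.

```c
#include <fcntl.h>
#include <unistd.h>

/* Write one byte to an already-open watchdog descriptor.
 * Returns 0 on success, -1 on error. */
int feed_watchdog(int fd)
{
    return write(fd, "\0", 1) == 1 ? 0 : -1;
}

/* Open the watchdog device and feed it every interval_sec seconds.
 * Only returns (with -1) if the device cannot be opened or a feed fails. */
int run_feeder(const char *path, unsigned int interval_sec)
{
    int fd = open(path, O_WRONLY);

    if (fd < 0)
        return -1;
    while (feed_watchdog(fd) == 0)
        sleep(interval_sec);
    close(fd);
    return -1;
}
```

A main() would then just call something like run_feeder("/dev/watchdog", 10),
where the 10 seconds is an assumed value that must be shorter than the
watchdog timeout.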

And so if it is not able to respond in time, by inference, some OTHER
PROCESS must have hung.   In other systems I saw there is a mother process
that monitors a few (not all) of its key child processes... so perhaps each
child will have one variable to signal to the mother that it is running.
If a child is not responding in time, the mother will clean up everything
and then purposely stop setting the watchdog, resulting in a reboot.


> Regards,
> Vipul.
>
>


-- 
Regards,
Peter Teoh

Re: watchdog pet in kernel module

2013-12-04 Thread Peter Teoh

On Thu, Dec 5, 2013 at 8:45 AM, Vipul Jain  wrote:

>
>
>
> On Tue, Dec 3, 2013 at 10:28 PM, Peter Teoh wrote:
>
>> Hi Vipul,
>>
>> I have seen this in a number of commercial software running on RHEL, and
>> on other realtime OS as well.   The watchdog mechanism is always working in
>> pair:   userspace "feeding" the dog (in the kernel).   (btw, feed the dog
>> is a more usually used term than "pet" the dog.   sorry for that.   google
>> for that and perhaps you can get more info?).
>>
>> Like Valdis said, this way you will know when userspace hang, which is
>> the key criteria for reboot.   Why do u want to detect if the kernel hang
>> (versus busy doing something)?   Theoretically that is not possible,
>> especially when all interrupt are disabled.
>>
>>>
>>> Hi Peter,
>
> If you don't mind can you please provide me more insight as what can be
> false alarm I can encounter to move pet inside kernel module?
>
>
"Feeding the dog" is simply a periodic timer that wakes up and set a
variable.   The fact that the variable gets set/reset also means that
the periodic timer IS working.   In userspace, if you just have one process
to "feed the watchdog", then essentially you are monitoring whether
overall system-wide performance is good enough for the periodic timer to
wake up at the required interval and reset the variable.   If some process
hangs, it MAY or MAY NOT affect the periodicity of this timer process.

But if you have the timer embedded inside a particular high-priority
process you want to monitor, and that process hangs, then "feeding the
watchdog" will not execute, and the kernel will reboot you (read below -
search for "reboot").

http://www.mjmwired.net/kernel/Documentation/watchdog/watchdog-api.txt

and more insights:

http://stackoverflow.com/questions/2020468/who-is-refreshing-hardware-watchdog-in-linux

(and lots of the "RELATED" questions at the side of the above page as well.)



> Regards,
> Vipul.
>



-- 
Regards,
Peter Teoh

Re: How can I 'getchar()' in module code?

2013-12-04 Thread Peter Teoh

Yes, exactly - what you are describing is called "kdb".   Don't mix it up
with "kgdb".

kdb:   this is debugging on the same computer - so no serial port
connection is needed.   Once an exception occurs, you are dropped into a
special debugger screen.   The catch is that this debugger is running in
kernel mode, inside the same computer whose kernel module is crashing,
so everything else stops running; only kdb is running.

(NOTE:   I played with this about 8 or 9 years ago, and it seems the
standalone kdb patch set is not updated any more; since 2.6.35 kdb ships
in the mainline kernel as a front end to kgdb.)

kgdb:   this always requires TWO computers:   host + debuggee.   kgdb
runs inside the debuggee whose kernel has crashed, and gdb runs on the
host, normally connected via serial port.   The usual preferred way is
to run the kernel to be debugged inside VirtualBox or VMware, and then
the gdb host is the virtual machine host.

diff between the two is explained here:

https://www.kernel.org/pub/linux/kernel/people/jwessel/kdb/CompileKDB.html

and setup are here (mainly for kgdb):

http://elinux.org/KDB
http://allmybrain.com/2010/04/29/debugging-linux-kernel-modules-with-virtualbox-and-kgdb/
http://www.linuxforu.com/2011/03/kgdb-with-virtualbox-debug-live-kernel/

have fun.

On Tue, Dec 3, 2013 at 8:35 PM, 乃宏周  wrote:

> For debugging purpose, I want something like 'getchar()' that can pause
> execution in the module code. Do any candidates I can choose?
>
>
>

-- 
Regards,
Peter Teoh

Re: about cheating upper layers

2013-12-04 Thread Peter Teoh

On Wed, Dec 4, 2013 at 7:49 PM, Peter Teoh  wrote:

>
>
>
> On Mon, Dec 2, 2013 at 1:48 PM, Guibin(Bill) Tian wrote:
>
>> Hi there,
>> Right now, I am trying to do such a thing.
>>
>> If a computer has multiple interface A and B, assume the packet is from
>> device A.
>> At ip layer, before the packet is transmitted to transport layer, I
>> change the source address and destination address of the IP header and
>> transmit to transport layer to pretend that this packet is from device B.
>> Not sure whether this can work or not.
>>
>
> feasible?   yes, theoretically u can change IP address, but there is a
> problem.   TCP port + IP address is combined together to uniquely identify
> which "socket" to pass to.   And each socket is always associated with each
> process.   if u change that the packet will be redirected to another
> process.   (this process context identification is done in the upper layer
> of TCP, ie, in interrupt context the packet has not been associated with
> any process yet.)
>
> and remember there is a checksum (TCP and IP) that need to be patched
> whenever u change anything.
>
> not sure why u want to that, but I suspect Netfilter should fulfill your
> requirement as well?
>
>

Sorry, not sure if the earlier explanation is clear to you?

So to be specific:

in net/ipv4/tcp_ipv4.c:

This is from the ingress path (still executing in software interrupt
context, i.e., the packet has not been assigned to any process yet, and so
you can always modify the packet content):

int tcp_v4_rcv(struct sk_buff *skb)
{
        const struct iphdr *iph;
        const struct tcphdr *th;
        struct sock *sk;
        int ret;
        struct net *net = dev_net(skb->dev);

        if (skb->pkt_type != PACKET_HOST)
                goto discard_it;

        /* Count it even if it's bad */
        TCP_INC_STATS_BH(net, TCP_MIB_INSEGS);

        if (!pskb_may_pull(skb, sizeof(struct tcphdr)))
                goto discard_it;
        ...

and in include/net/sock.h:

/* This is the per-socket lock.  The spinlock provides a synchronization
 * between user contexts and software interrupt processing, whereas the
 * mini-semaphore synchronizes multiple users amongst themselves.
 */
typedef struct {
        spinlock_t              slock;
        int                     owned;
        wait_queue_head_t       wq;
        ...

To modify the packet, you can either do it before the above function, or
after the function but before the packet gets assigned to its rightful
owner.   (MY GUESS.)


>
>> There are other interface specific information in the skb structure like
>> the net_device member. If I pass the packet to transport layer only with my
>> proposed modification, will the application's socket detect this? I
>> didn't look into the socket match code, not sure if the socket match will
>> check other information in skb struct beside the ip address and port number.
>>
>> Thanks for your help.
>>
>> Bill
>>
>>
>>
>
>
> --
> Regards,
> Peter Teoh
>



-- 
Regards,
Peter Teoh

Re: about cheating upper layers

2013-12-04 Thread Peter Teoh

On Mon, Dec 2, 2013 at 1:48 PM, Guibin(Bill) Tian  wrote:

> Hi there,
> Right now, I am trying to do such a thing.
>
> If a computer has multiple interface A and B, assume the packet is from
> device A.
> At ip layer, before the packet is transmitted to transport layer, I change
> the source address and destination address of the IP header and transmit to
> transport layer to pretend that this packet is from device B. Not sure
> whether this can work or not.
>

Feasible?   Yes, theoretically you can change the IP address, but there is
a problem.   TCP port + IP address are combined to uniquely identify which
"socket" the packet is passed to, and each socket is associated with a
process.   If you change them, the packet will be redirected to another
process.   (This process identification is done in the upper layer of TCP,
i.e., in interrupt context the packet has not been associated with any
process yet.)
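
To make the point concrete, here is a toy userspace sketch of matching a
packet to a socket by its address/port 4-tuple.   It is NOT the kernel's
real lookup (which is hash-based and has more cases, e.g. listening
sockets); struct conn and lookup_owner() are hypothetical names used only
for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a connected socket's identity. */
struct conn {
    uint32_t saddr, daddr;   /* remote and local IPv4 address */
    uint16_t sport, dport;   /* remote and local TCP port */
    int owner_pid;           /* process that owns the socket */
};

/* Return the owner of the connection matching the packet's 4-tuple,
 * or -1 when no connection matches. */
int lookup_owner(const struct conn *tbl, size_t n,
                 uint32_t saddr, uint16_t sport,
                 uint32_t daddr, uint16_t dport)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (tbl[i].saddr == saddr && tbl[i].sport == sport &&
            tbl[i].daddr == daddr && tbl[i].dport == dport)
            return tbl[i].owner_pid;
    return -1;
}
```

If you rewrite the addresses in the IP header before this kind of lookup
runs, the same packet can match a different entry (or none at all), which
is the redirection problem described above.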

And remember there are checksums (TCP and IP) that need to be patched
whenever you change anything.
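
For the IP header part, the checksum is the standard Internet checksum
(ones' complement of the ones' complement sum of the header's 16-bit
words, RFC 1071).   A minimal userspace sketch follows; ip_checksum() is a
name made up for this example, and inside the kernel you would rather use
the existing helpers (e.g. ip_send_check()) than roll your own.

```c
#include <stddef.h>
#include <stdint.h>

/* Compute the Internet checksum over an IPv4 header.
 * hdr holds the header bytes in network order with the checksum field
 * (bytes 10 and 11) set to zero; the result is the 16-bit value to store
 * there, most significant byte first. */
uint16_t ip_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);
    if (i < len)
        sum += (uint32_t)(hdr[i] << 8);     /* odd trailing byte, zero padded */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16); /* fold the carries back in */
    return (uint16_t)~sum;
}
```

After changing the source or destination address, zero the checksum field,
call ip_checksum() over the header (ihl * 4 bytes) and store the result
back.   Running the function over a header that already carries a correct
checksum gives 0, which is a handy sanity check.   The TCP checksum also
covers a pseudo-header containing both IP addresses, so it has to be
redone as well.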

Not sure why you want to do that, but I suspect Netfilter should fulfill
your requirement as well?

>
> There are other interface specific information in the skb structure like
> the net_device member. If I pass the packet to transport layer only with my
> proposed modification, will the application's socket detect this? I
> didn't look into the socket match code, not sure if the socket match will
> check other information in skb struct beside the ip address and port number.
>
> Thanks for your help.
>
> Bill
>
>
>

-- 
Regards,
Peter Teoh

Re: Recovering Linux system from hung state via software

2013-12-04 Thread Peter Teoh

On Wed, Dec 4, 2013 at 4:13 PM, Mandeep Sandhu
wrote:

> > assuming one mother process is monitoring 10 child process, so inside
> each
> > child process, simply just setup a PERIODIC (eg, per 5 sec) mechanism to
> > toggle a binary variables through IPC means.   It will be reset when the
> > mother process go around checking all the variable status and, if not
> reset
> > it therefore implies that the particular process might be hung... it can
> > wait further, or continue checking other process.   at the end of
> checking
> > ALL the process, if everything is OK, it should feed the kernel watchdog
> > timer.   if the kernel watchdog timer is not reset, the kernel module
> will
> > then reboot the system.   (ie, reboot is from kernel module).
>
> Hold on! Why should we reboot the whole system if only some of these
> processes are misbehaving?!?! Why should other processes suffer due
> this? Wouldn't it be better to just kill the erroneous process (like
> how most OS's anyway do, eg: "Force Quit" in Ubuntu, or chrome tabs).
>
>
In many COTS software systems, the processes are highly dependent on one
another: some of them talk to hardware, and others just process the
intermediate data.   When something goes wrong, it is difficult to diagnose
the fault in realtime (i.e., by a self-diagnosis mechanism), which is why
fault logging is important and is always done on flash or hard disk, not
on a temporary filesystem.   So it is better to reboot.   Yes, not every
process needs to trigger a reboot, so design it with care.   E.g., an
Apache server can always afford to be killed and restarted.


> Or are these processes the only ones running on the system?
>
> -mandeep
>



-- 
Regards,
Peter Teoh

Re: Recovering Linux system from hung state via software

2013-12-04 Thread Peter Teoh

On Fri, Nov 29, 2013 at 8:28 AM, Vipul Jain  wrote:

> Hi Kernel alias,
>
> I am a newbie and I am trying to figure out ways where in I can recover the
> Linux in below two scenarios:
> 1. my specific process hangs.
>

How to recover I cannot tell you, because it is application specific (but
best is to design your system so it can reboot completely.   E.g., temporary
files should be stored in memory, e.g. tmpfs, so that after rebooting they
are all "gone" - not securely erased, but logically "gone").   And how
to detect it is this:

Assume one mother process is monitoring 10 child processes.   Inside each
child process, simply set up a PERIODIC (e.g., every 5 sec) mechanism to
set a binary variable through some IPC means.   The mother process resets
it when it goes around checking all the variables; if a variable has not
been set again by the next round, that implies the particular process
might be hung... the mother can wait further, or continue checking the
other processes.   At the end of checking ALL the processes, if everything
is OK, it should feed the kernel watchdog timer.   If the kernel watchdog
timer is not reset, the kernel module will then reboot the system (i.e.,
the reboot is from the kernel module).
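
The mother's checking round can be sketched like this.   It is only a
sketch under assumptions: the flags are shown as a plain array, whereas in
a real system they would live in shared memory or some other IPC
mechanism, and check_and_clear() is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>

/* One heartbeat flag per child; each child sets its own flag to 1
 * periodically.  The mother calls this once per round: it reports whether
 * every child has checked in since the previous round, and clears all
 * flags so the next round starts fresh. */
bool check_and_clear(volatile int *flags, size_t n)
{
    bool all_alive = true;
    size_t i;

    for (i = 0; i < n; i++) {
        if (flags[i] == 0)
            all_alive = false;   /* child i did not report this round */
        flags[i] = 0;            /* re-arm for the next round */
    }
    return all_alive;
}
```

The mother feeds the kernel watchdog only when check_and_clear() returns
true; if it keeps returning false, the watchdog is left unfed and the
system reboots.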

>
> 2. kernel gets hung partially or completely.
>
> I have done some reading and seems like there is softlockup and hardlockup
> mechanisms in Linux source base that I can use but not sure, if yes I have
> below questions:
> 1. Which kernel version is minimum required for this?
> 2. How do I know that soft and hard lockup are enabled in my kernel?
> 3. How can I customize the behavior of default action that been taken?
> 4. Can I use these two lockup mechanism to find out if my process is hung
> or not?
> 5. Any pointers to any docs that can help will be appreciated.
>
> I will greatly appreciate any help here.
>
>
> Regards,
> Vipul.
>
>
>

-- 
Regards,
Peter Teoh

Re: watchdog pet in kernel module

2013-12-03 Thread Peter Teoh

Hi Vipul,

I have seen this in a number of commercial software products running on
RHEL, and on other realtime OSes as well.   The watchdog mechanism always
works as a pair:   userspace "feeding" the dog (in the kernel).   (Btw,
"feed the dog" is a more commonly used term than "pet the dog".   Google
for that and perhaps you can get more info?)

Like Valdis said, this way you will know when userspace hangs, which is the
key criterion for a reboot.   Why do you want to detect whether the kernel
has hung (versus being busy doing something)?   Theoretically that is not
possible, especially when all interrupts are disabled.

On Wed, Dec 4, 2013 at 6:45 AM, Vipul Jain  wrote:

>
>
>
> On Tue, Dec 3, 2013 at 2:31 PM,  wrote:
>
>> On Tue, 03 Dec 2013 13:15:32 -0800, Vipul Jain said:
>>
>> > currently we configure/pet the watchdog from user space via /dev/ipmi0
>> > device interface and I would like to do the pet part from kernel module.
>>
>> That's actually defeating the purpose.  If you do it from the kernel,
>> you keep the watchdog from detecting a whole set of hangs that can cause
>> userspace to wedge up.
>>
>
> Well we use different mechanism to detect user space hangs and take
> corrective actions. Hence we want to separate the user space issues from
> kernel space issues by using hardware watchdog pet in kernel space.
>
>
>

-- 
Regards,
Peter Teoh

Re: Getting struct page pointer from virtual address

2013-09-12 Thread Peter Teoh

I think there is no exported function for that, but there is a global
variable you can use instead.   The reason is performance - virt_to_phys()
is a macro that gets compiled inline.   More details here:

http://stackoverflow.com/questions/5982125/how-to-get-a-struct-page-from-any-address-in-the-linux-kernel


On Wed, Sep 4, 2013 at 1:29 AM, ajay saini wrote:

> More information :
> - Linux kernel version : 2.6.32 (But I would like a method which is
> portable to other higher versions as well)
> - I tried using follow_page, but this function is not exported from the
> kernel so, can't use it. (Any reason why this function is not exported??)
>
> Thanks
> Ajay
>
>   --
>  *From:* ajay saini 
> *To:* "kernelnewbies@kernelnewbies.org" 
> *Sent:* Tuesday, 3 September 2013 1:21 PM
> *Subject:* Getting struct page pointer from virtual address
>
> Hey,
>
> I am working on a linux kernel module and I have a virtual address and mm
> (struct mm_struct) for a process in this module. I can find the virtual
> memory area to which this address belongs to by using find_vma.
>
> Is there a function in the linux kernel which I can use in this module
> (i.e. exported from the kernel) to get struct page pointer for this virtual
> address.
>
> Thanks
> Ajay
>
>
>
>
>


-- 
Regards,
Peter Teoh

Re: Filesystem and files getting corrupted

2013-07-03 Thread Peter Teoh

Any kernel debugging always starts with the dmesg output; please provide a
snapshot of that (preferably by posting the full listing at pastebin.com).

On Tue, Jun 25, 2013 at 4:08 AM, Daniel Hilst Selli
wrote:

> I'm working on an embedded project based on var-som-am35 from TI. [1]
>
> I experiencing a lot of corruption from files and even the entire
> filesystem... is there any guide on how debug filesystems corruption?
>
> We already tried vfat and ext3 fs.. changed media, changed machines...
> The filesystem runs on mmc card, or on usb flash drive... There is a
> java application running on top of this filesystem, which uses JMS, that
> is very I/O aggressive..
>
> Cheers,
>
>



-- 
Regards,
Peter Teoh

Re: Why is that the write speed of DDR SDRAM are faster than the DDR2

2013-06-04 Thread Peter Teoh

this is a tradeoff: DDR2 gives higher speed but higher read latencies,
while DDR gives lower speed but lower read latencies.

so if u make the bus speed the same for both, then DDR2's higher latencies
will make it slower than DDR.
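
a rough way to see this in numbers: the absolute CAS latency is the CAS cycle
count multiplied by the cycle time of the I/O clock. The sketch below is only
that arithmetic; the module names and timings are typical-looking values picked
for illustration (an assumption on my side), not measurements of the boards in
this thread.

```python
def cas_latency_ns(cas_cycles, io_clock_mhz):
    """Absolute CAS latency in nanoseconds: cycles * cycle time."""
    return cas_cycles * 1000 / io_clock_mhz


# (name, CAS cycles, I/O clock in MHz) -- illustrative values only
modules = [
    ("DDR-400  CL3", 3, 200),
    ("DDR2-400 CL4", 4, 200),  # same bus clock as DDR-400, more cycles
    ("DDR2-800 CL5", 5, 400),  # a higher clock hides the extra cycles
]

for name, cl, mhz in modules:
    print(f"{name}: {cas_latency_ns(cl, mhz):.1f} ns")
```

with the same bus clock, the part needing more CAS cycles waits longer in
absolute time; only when the clock goes up does DDR2 win that back.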

http://www.diffen.com/difference/DDR_vs_DDR2

and a technical comparison chart with numbers:

http://www.freescale.com/webapp/sps/site/overview.jsp?code=784_LPBB_DDR


On Mon, Jun 3, 2013 at 2:39 PM, devendra.aaru wrote:

> Hello,
>
> I have two different types of hardware, one with DDR SDRAM and another
> DDR2,
> i tested them with bw_mem tool for write bandwidths, seems that at
> higher writes of  >512kbytes the DDR is faster than the DDR2(more than
> 40%). But when compared to reads, DDR is slower (more than 50%). I
> couldn't find any reference that explains why. AFAIK, the DDR2 works
> double the faster rate than the DDR.
>
>
> any ideas?
>
> Thanks,
>
>



-- 
Regards,
Peter Teoh

Re: Analyzing Kernel call traces.

2013-05-17 Thread Peter Teoh

On Wed, May 8, 2013 at 3:16 PM, Shraddha Kamat  wrote:

> Any good tutorial for analyzing kernel call traces ? I want to
> know what is the meaning of everything that appears in the call
> trace and get to the exact cause of the problem.
>

sorry , u mean "backtrace" call trace?   or kernel oops?

http://www.linuxforu.com/2011/01/understanding-a-kernel-oops/

and here is another trace:

http://elinux.org/Kernel_Function_Trace

which depends on the instrumentation method:

http://elinux.org/images/6/68/Kfiboot-9.lst

http://elinux.org/Kernel_Instrumentation

http://elinux.org/Instrumentation_API

many of these traces simply depend on the concept of call frames, i.e. the
range of memory addresses on the stack used by each function.

the above page also mentions the use of gcc -pg; not mentioned are other
features of gcc (man gcc):

   -finstrument-functions
   -finstrument-functions-exclude-function-list=sym,sym,...
   -finstrument-functions-exclude-file-list=file,file,...

Beware though, sometimes the compilation will explicitly remove the use of
the frame pointer:

   -fomit-frame-pointer

without the frame pointer ("ebp") alongside "esp" to demarcate the start and
end of a frame, there is no reliable way to know where a call frame begins
and ends, and therefore the "stack trace" or "call trace" will not be
accurately shown. Another possibility is that the function is declared
"static", in which case u may end up with a numerical offset from the
nearest function that does have a name.
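
to make the frame pointer idea concrete, here is a small simulation of walking
the saved-ebp chain. It is only a sketch: the "memory" is a plain dict, the
addresses are made up, and the layout (saved ebp at [ebp], return address at
[ebp + 4]) is the classic 32-bit x86 convention with frame pointers enabled.

```python
def walk_frames(memory, ebp, max_depth=64):
    """Follow the saved-ebp chain and collect return addresses.

    Assumed layout (classic x86-32 with frame pointers):
      memory[ebp]     -> caller's saved ebp
      memory[ebp + 4] -> return address into the caller
    A saved ebp of 0 marks the outermost frame.
    """
    trace = []
    while ebp != 0 and len(trace) < max_depth:
        trace.append(memory[ebp + 4])
        ebp = memory[ebp]
    return trace


# Hypothetical stack: three nested calls, made-up addresses.
stack = {
    0x1000: 0x1020, 0x1004: 0xC0100ABC,
    0x1020: 0x1040, 0x1024: 0xC0100DEF,
    0x1040: 0x0,    0x1044: 0xC0100123,
}

print([hex(addr) for addr in walk_frames(stack, 0x1000)])
```

if the code was built with -fomit-frame-pointer, the memory[ebp] links simply
are not there, and a walker like this has nothing to follow.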

-- 
Regards,
Peter Teoh

Re: atomic operations

2013-02-24 Thread Peter Teoh

in simple terms, any operation which, in terms of assembly instructions, can
be executed in ONE instruction is "atomic", because, just like an atom, it
cannot be broken up into parts (this is the uniprocessor view; on SMP an
instruction that reads and then writes memory also needs a bus lock, e.g.
the x86 LOCK prefix).   any operation that takes more than one instruction,
for eg, TWO instructions, is NOT atomic, because in BETWEEN the first and
2nd instruction, something like an interrupt can come in, and affect the
values of the operand as it is passed from the first instruction to the
second.  To save me from reiteration:

http://www.ibm.com/developerworks/library/pa-dalign/ (search for
"atomicity").

http://stackoverflow.com/questions/381244/purpose-of-memory-alignment

http://lwn.net/Articles/260832/

http://www.songho.ca/misc/alignment/dataalign.html

http://www.cis.upenn.edu/~palsetia/cit595s08/Lectures08/alignmentOrdering.pdf

Essentially, atomicity and non-alignment become problematic when u try to
read a multi-byte value (non-byte addressing mode) from a non-aligned
address.
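
a tiny sketch of why alignment matters here: count how many naturally aligned
words an access has to touch. This is a simplified model (a 4-byte word is my
assumption; real hardware also cares about bus width and cache lines), but an
access that spans two words needs two transfers, and something else can get
in between them.

```python
def words_touched(addr, size, word_size=4):
    """Number of aligned words covered by an access of `size` bytes at `addr`."""
    first = addr // word_size
    last = (addr + size - 1) // word_size
    return last - first + 1


for addr in (0x1000, 0x1002):
    n = words_touched(addr, 4)
    kind = "single transfer" if n == 1 else "split across transfers"
    print(f"4-byte read at {addr:#x}: {n} word(s), {kind}")
```

a byte access can never straddle two words, which is why byte addressing is
not affected.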

On Sun, Feb 24, 2013 at 5:42 PM, Shraddha Kamat  wrote:

> what is the relation between atomic operations and memory alignment ?
>
> I read from UTLK that "an unaligned memory access is not atomic"
>
> please explain me , I am not able to get the relationship between
> memory alignment and atomicity of the operation.
>
>
>

-- 
Regards,
Peter Teoh

Re: atomic operations

2013-02-24 Thread Peter Teoh

Another good article on atomicity and data sizes:

http://www.ibm.com/developerworks/library/pa-atom/

On Sun, Feb 24, 2013 at 8:50 PM, Peter Teoh  wrote:

> in simple terms, any operation, in terms assembly instructions, which can
> be executed in ONE instruction, is "atomic", because, just like an atom, it
> cannot be broken up into parts.   any instructions that is longer than one,
> for eg, TWO instruction, is NOT atomic, because in BETWEEN the first and
> 2nd instruction, something like an interrupt can come in, and affect the
> values of the operand when it is passed from instruction one to second
> instruction.  To save me from reiteration:
>
> http://www.ibm.com/developerworks/library/pa-dalign/ (search for
> "atomicity").
>
> http://stackoverflow.com/questions/381244/purpose-of-memory-alignment
>
> http://lwn.net/Articles/260832/
>
> http://www.songho.ca/misc/alignment/dataalign.html
>
>
> http://www.cis.upenn.edu/~palsetia/cit595s08/Lectures08/alignmentOrdering.pdf
>
> Essentially, atomicity and non-alignment become problematic when u tried
> to to read using non-byte addressing mode with non-aligned address.
>
> On Sun, Feb 24, 2013 at 5:42 PM, Shraddha Kamat wrote:
>
>> what is the relation between atomic operations and memory alignment ?
>>
>> I read from UTLK that "an unaligned memory access is not atomic"
>>
>> please explain me , I am not able to get the relationship between
>> memory alignment and atomicity of the operation.
>>
>>
>>
>
>
>
> --
> Regards,
> Peter Teoh
>



-- 
Regards,
Peter Teoh

Re: V4L2 Framework

2013-02-23 Thread Peter Teoh

es in the
kernel source:

mem2mem_testdev.c    v4l2-mem2mem.c

and all the APIs u can use (inside kernel drivers) are listed above (as
EXPORT symbol).


On Mon, Feb 18, 2013 at 12:29 PM, Kaushal Billore <
kaushalbill...@hotmail.com> wrote:

> I have some doubt regarding Linux kernel V4l2 API's.
> When capture application calls Reqbuff ioctl to allocate n no of buffer
> which would belongs to v4l2 layer and display application calls the Reqbuff
> ioctl to allocate N no of buffer which would also belongs to device memory.
>
> Question:
> 1. V4l2 maintains the generic layer for all devices in which buffers can
> be allocated by any device and can be handle by any device?
>
> 2. If not then while capturing the data from capture device can capture
> device allocated buffer gets filled and while displaying the same data
> there memory copy happens between capture buffer and output buffers?
>
> 3. If not then I want to capture data from capture device and display onto
> display device through the v4l2 framework layer.
>
> Awaiting for responce!
>
> Thanks in advance
> Kaushal
>
>
>


-- 
Regards,
Peter Teoh

Re: How controll is passed from uboot to kernel

2013-02-19 Thread Peter Teoh

Reading the uboot source code:

In common/cmd_bootm.c:

int do_bootm (cmd_tbl_t *cmdtp, int flag, int argc, char *argv[])
{

And within this, do_bootm_linux() is called:

    do_bootm_linux(cmdtp, flag, argc, argv,
                   addr, len_ptr, verify);

And inside do_bootm_linux() (platform-specific; for x86 it is in
lib_i386/i386_linux.c) the load_zimage() function is called, which
effectively loads the kernel image file.

On Sat, Feb 16, 2013 at 12:43 PM, Chetan C.R.  wrote:

> Hi All,
>
> I need to know how the control is passed from u-boot to kernel in Linux
> operating system
>
>
> Thanks in Advannce
>
>
>

-- 
Regards,
Peter Teoh

Re: MAX limit of file descriptor

2013-02-12 Thread Peter Teoh

one more:

To modify system-wide limits:

    /etc/security/limits.conf


On Tue, Feb 12, 2013 at 5:31 PM, Peter Teoh  wrote:

> perhaps i can add more info, after doing more investigation:
>
> a.   "ulimit" is a shell feature, it is not a command line binary.   "man
> bash" and "man sh" and u can see ulimit has different feature available for
> u.
>
> b.   ulimit control all the resources defined by the processes spawn from
> the current shell onwards...ie, once ulimit is change, all child processes
> from that shell onwards will change.  but resources limit in another shell,
> existing processes etc does not.
>
> c.   ulimit is a userspace feature, the kernel will have all the
> corresponding feature of max open files etc...but definitely it is not
> unlimited like that of ulimit.
>
> d.   to see ALL the open files u can use "lsof" and "-p" give u control to
> point at which process to dig for open files.   it also list all the open
> connections (TCP) for u...which is what u want.
>
> e.   generally java applications will open many many files descriptor
> concurrently:
>
>
> http://www.java.net/forum/topic/glassfish/glassfish/too-many-open-files-issue
>
> (above listed 4500, and many others java apps like IBM RSA also have many).
>
> On Sat, Feb 9, 2013 at 1:10 PM, horseriver  wrote:
>
>> hi:)
>>
>>In one process ,what is the max number of opening file descriptor ?
>>Can it be set to infinite ?
>>
>>In network programing ,what is the essential for  the maximum of
>> connections
>>dealed per second
>>
>> thanks!
>>
>>
>
>
>
> --
> Regards,
> Peter Teoh
>



-- 
Regards,
Peter Teoh

Re: MAX limit of file descriptor

2013-02-12 Thread Peter Teoh

perhaps i can add more info, after doing more investigation:

a.   "ulimit" is a shell builtin, it is not a command line binary.   see "man
bash" and "man sh" and u can see that ulimit has different features available
in each.

b.   ulimit controls the resource limits of the processes spawned from the
current shell onwards...ie, once ulimit is changed, all child processes
from that shell onwards will see the change.  but the resource limits in
another shell, in existing processes etc do not change.

c.   ulimit is a userspace interface (the shell calls setrlimit() for u); the
kernel enforces the corresponding limits such as max open files...and inside
the kernel these are definitely not unlimited, whatever ulimit reports.

d.   to see ALL the open files u can use "lsof", and "-p" lets u point it at
the process whose open files u want to dig into.   it also lists all the open
connections (TCP) for u...which is what u want.

e.   generally java applications will open many, many file descriptors
concurrently:

http://www.java.net/forum/topic/glassfish/glassfish/too-many-open-files-issue

(above listed 4500, and many others java apps like IBM RSA also have many).
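
points (b) and (c) above can be checked from any program, not only from the
shell. A minimal sketch, assuming Linux and Python's standard "resource"
module (which wraps getrlimit()/setrlimit()); the helper names are mine. It
lowers the soft limit for open files, starts a child, and shows that the child
inherits the lowered value:

```python
import resource
import subprocess
import sys


def nofile_limits():
    """Return the (soft, hard) limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def child_sees_lowered_limit(new_soft=256):
    """Lower our soft limit, ask a child for its limit, then restore ours."""
    soft, hard = nofile_limits()
    # Only ever lower the soft limit; raising it may not be permitted.
    if soft != resource.RLIM_INFINITY and soft < new_soft:
        new_soft = soft
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    try:
        out = subprocess.run(
            [sys.executable, "-c",
             "import resource;"
             "print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"],
            capture_output=True, text=True, check=True,
        ).stdout
    finally:
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
    return new_soft, int(out)


print("current (soft, hard):", nofile_limits())
print("limit we set, limit the child saw:", child_sees_lowered_limit())
```

processes that were already running before the change keep their old limits,
same as with ulimit in a shell.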

On Sat, Feb 9, 2013 at 1:10 PM, horseriver  wrote:

> hi:)
>
>In one process ,what is the max number of opening file descriptor ?
>Can it be set to infinite ?
>
>In network programing ,what is the essential for  the maximum of
> connections
>dealed per second
>
> thanks!
>
>

-- 
Regards,
Peter Teoh

Re: MAX limit of file descriptor

2013-02-10 Thread Peter Teoh

On Sun, Feb 10, 2013 at 8:29 PM,
wrote:

> Hi!
>
> On 13:10 Sat 09 Feb , horseriver wrote:
> > hi:)
> >
> >In one process ,what is the max number of opening file descriptor ?
>
> Type "ulimit -a" in your shell. On my system (debian) the default is 1024.
>

Hi Michael, nice to see u again.

BTW, many of the parameters reported by ulimit also have to be taken
with some doubt:

ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 47543
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 47543
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

the above is for Ubuntu 12.04 with a 32-bit kernel (3.2.0), but of course we
know that the max file size has a limit - depending on whether it is ext2 or
ext3 or ext4.   i cannot remember the exact numbers, but at the general
conceptual level, there is a limit.

even for "CPU time"...it is limited by the underlying bit length of the
representation for time.   as usual...i don't know the details :-(, just the
concept.   sorry :-(.



>
> >Can it be set to infinite ?
>
> Maybe, but at least it can be set very high.
>
> >In network programing ,what is the essential for  the maximum of
> connections
> >dealed per second
>
> - Use non blocking i/o and epoll(). Do *not* create 1 process/thread for
>   each connection and do not use select().
> - Obviously, the more memory your application uses, the more memory has to
>   be put in the server. IIRC, 1 tcp connection uses ~1kb kernel memory.
> - The same applies for cpu time. On the system side, you may want to
>   recommend network adaptors which can be switched to polling instead of
>   raising 1 interrupt per packet. You should expect to see lots of small
>   packets on the network.
>
> -Michi
> --
> programing a layer 3+4 network protocol for mesh networks
> see http://michaelblizek.twilightparadox.com
>
>



-- 
Regards,
Peter Teoh

Re: printk question:why release console_sem after logbuf_lock

2013-02-09 Thread Peter Teoh

For details u have to look at the source code; my guess is based on the
following postings:

http://kerneltrap.org/mailarchive/linux-kernel/2008/1/23/595569

https://lkml.org/lkml/2011/6/21/249

logic:

a.  u want to be able to do printk from anywhere.
b.  but every call to printk requires the console_sem lock to be acquired.
after acquiring console_sem, printk actually serializes the output to a
memory buffer.
c.  now a problem arises when printk is happening very fast, and so this type
of lock is ill-suited for printk().
d.  after this patch came another attempt:

https://patchwork.kernel.org/patch/1760211/
https://lkml.org/lkml/2012/10/20/90

where lazy irq work is being used instead.

read through the comments in the intro to the patch - it covers a lot more
than i mentioned here.

Documentation/lockdep-design.txt discusses using irq tracing to trace the
lock dependencies.

Lock inversion is a common computer science problem...look it up on wiki.
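
to show what "lock inversion" means for the two locks named in the patch
description, here is a toy order checker. It is NOT how lockdep is
implemented (lockdep tracks lock classes, whole dependency chains and irq
state); this sketch only remembers "A was held while taking B" pairs and
complains when it later sees the opposite order. The class and method names
are made up for illustration.

```python
class LockOrderChecker:
    """Toy lockdep: remembers 'A was held while taking B' pairs."""

    def __init__(self):
        self.seen = set()  # (held, taken) pairs observed so far
        self.held = []     # locks currently held, in acquisition order

    def acquire(self, name):
        for h in self.held:
            if (name, h) in self.seen:
                raise RuntimeError(
                    f"possible inversion: {h} -> {name}, "
                    f"but {name} -> {h} was seen earlier")
            self.seen.add((h, name))
        self.held.append(name)

    def release(self, name):
        self.held.remove(name)


checker = LockOrderChecker()

# Path 1: a printk-like path holds logbuf_lock and does a wakeup (rq->lock).
checker.acquire("logbuf_lock")
checker.acquire("rq->lock")
checker.release("rq->lock")
checker.release("logbuf_lock")

# Path 2: the scheduler holds rq->lock and then prints (logbuf_lock).
checker.acquire("rq->lock")
try:
    checker.acquire("logbuf_lock")
except RuntimeError as err:
    print(err)
```

if two CPUs run path 1 and path 2 at the same time, each can end up holding
the lock the other one is waiting for, which is the deadlock the ordering
rule is meant to prevent.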

On Sat, Feb 9, 2013 at 10:55 AM, buyitian  wrote:

>   in the patch 0b5e1c5255e7ee8670e077e8224e5c2281229a5b, it releases
> console_sem after logbuf_lock,  the description of this patch is as below:
>
> Release console_sem after unlocking the logbuf_lock so that we don't
> generate wakeups while holding logbuf_lock. This avoids some lock
> inversion troubles once we remove the lockdep_off bits between
> logbuf_lock and rq->lock (prints while holding rq->lock vs doing
> wakeups while holding logbuf_lock).
> There's of course still an actual deadlock where the printk()s under
> rq->lock will issue a wakeup from the up() call, but lockdep won't
> warn about that since semaphores are not tracked.
>
>   could you please give me a detail example about the issue it tries
> to fix? thanks.
>
>
>

-- 
Regards,
Peter Teoh

Re: Kernel code interrupted by Timer

2013-02-09 Thread Peter Teoh

On Sun, Feb 10, 2013 at 12:22 AM, Frederic Weisbecker wrote:

> 2013/2/9 Peter Teoh :
> > A search in the entire subtree of arch/x86/ and including all its
> > subdirectories, (for 3.2.0 kernel) return only TWO result where
> > preempt_schedule_irq is called:   kernel/entry_64.S and
> kernel/entry_32.S.
> > And the called is in fact resume_kernel(),   ie, it is NOT called from
> timer
> > interrupt, but from wakeup context of the CPU, and is only executed ONCE
> > upon waking up from hibernation.
> >
> > for example, calling from here:
> >
> > https://lkml.org/lkml/2012/5/2/298
> >
> > so definitely this preempt_schedule_irq() calling from irq mode is rare
> - at
> > least for x86.
>
> The name "resume_kernel" can indeed sound like something that is
> called on hibernation resume. It's actually not related at all. It's a
> piece of code that is called at the end of every irq and exception
> when the interrupted code was running in the kernel. If the
> interrupted code was running in userspace, we jump to
> resume_userspace.
>

well, i guessed u must be the expert here, i have yet to really digest all
these...:-).   thanks for the explanation.

-- 
Regards,
Peter Teoh

Re: Kernel code interrupted by Timer

2013-02-09 Thread Peter Teoh

> > ...'s actually fine. Later on, the scheduler restores the previous task
> > to the middle of preempt_schedule_irq() and the irq completes its
> Sorry didn't understand this sentence i.e. "scheduler restores the
> previous task to the middle of preempt_schedule_irq()".
> > return to what it interrupted. The state of the processor prior to the
> > interrupt is stored on the task stack. So we can restore that anytime.
> > Note if the irq interrupted userspace, it can do about the same thing,
> > except it calls schedule() directly instead of preempt_schedule_irq().
> >
>
>
>
>



-- 
Regards,
Peter Teoh

Re: MAX limit of file descriptor

2013-02-09 Thread Peter Teoh

i can only make a general statement, which may not always be true:

in the kernel almost EVERYTHING HAS TO BE FINITE...and this is to cater for
the fact that

but at the userspace or application level, u can design structures to be
infinite.   eg, I used python for large number calculation, and so far it
has no limits, but I am sure at the representation level, there is
one...but because i don't know the datastructure used, i don't know the
limits.

On Sat, Feb 9, 2013 at 1:10 PM, horseriver  wrote:

> hi:)
>
>In one process ,what is the max number of opening file descriptor ?
>Can it be set to infinite ?
>
>In network programing ,what is the essential for  the maximum of
> connections
>dealed per second
>
> thanks!
>
>

-- 
Regards,
Peter Teoh

Re: Kernel code interrupted by Timer

2013-02-09 Thread Peter Teoh

On Sat, Feb 9, 2013 at 4:20 PM, Peter Teoh  wrote:

>
>
> On Sat, Feb 9, 2013 at 3:51 PM, anish kumar 
> wrote:
>
>> On Sat, 2013-02-09 at 14:57 +0800, Peter Teoh wrote:
>> >
>> >
>> > On Sat, Feb 9, 2013 at 1:47 PM, anish kumar
>> > .
>> > Timer interrupts is supposed to cause scheduling and scheduler
>> > may or
>> > may not pick up your last process(we always use the term
>> > "task" in
>> > kernel space) after handling timer interrupt.
>> > >
>> >
>> >
>> >
>> > Sorry if I may disagree, correct me if wrong.   Timer interrupt and
>> > scheduler is two different thing.   I just counted in the "drivers"
>> > subdirectory, there are at least more than 200 places where
>> > "setup_timer()" is called, and these have nothing to do with
>> > scheduling.   For eg, heartbeat operation etc.  Not sure I
>> > misunderstood something?
>> Have a look at kernel/timer.c and kernel/hrtimer.c.
>> There are many sched() calls in these files.This will invoke scheduler.
>> >
>>
>
> kernel/timer.c and kernel/hrtimer.c are implementing the logic outside of
> timer interrupt context, ie, it is NOT executed in timer interrupt context,
> but in bottom half context.   the real timer interrupt context is done in
> arch-specific branch:  arch/x86/kernel/tsc.c, for example, and the entire
> tsc.c has no scheduling concept in it.   the entire file tsc.c in fact is
> handling all the hardware-specific stuff - in the top-half context.
>

one mistake here:  kernel/timer.c is running in bottom half interrupt
context, which is still in interrupt context/mode...but as I glanced
through the entire kernel/timer.c, there is no task scheduling called
anywhere in this file.   it is doing timer scheduling in fact.   whereas
the context switching we were discussing, which necessitates consistent state
maintenance, is done in task scheduling (inside kernel/sched.c).   of
course the timer interrupt will sometimes trigger the task scheduling logic,
but not always...not sure if my statement is correct?
(no time to search the source, please pardon me).


>
> in linux kernel scheduling is done in two ways:  voluntary and involuntary
> scheduling.   involuntary scheduling means it is triggered by timer
> interrupt.   but voluntary scheduling (which is only recently introduced
> into kernel for performance reasons) drastically improve the latency
> numbers.voluntary scheduling is NOT triggered by timer, but ANYONE who
> want to give up the CPU can call sched_cpu() to do a rescheduling.
>
> hope i am not wrong.
>
>
>> >
>> > --
>> > Regards,
>> > Peter Teoh
>>
>>
>>
>
>
> --
> Regards,
> Peter Teoh




-- 
Regards,
Peter Teoh

Re: Kernel code interrupted by Timer

2013-02-09 Thread Peter Teoh

On Sat, Feb 9, 2013 at 3:51 PM, anish kumar wrote:

> On Sat, 2013-02-09 at 14:57 +0800, Peter Teoh wrote:
> >
> >
> > On Sat, Feb 9, 2013 at 1:47 PM, anish kumar
> > .
> > Timer interrupts is supposed to cause scheduling and scheduler
> > may or
> > may not pick up your last process(we always use the term
> > "task" in
> > kernel space) after handling timer interrupt.
> > >
> >
> >
> >
> > Sorry if I may disagree, correct me if wrong.   Timer interrupt and
> > scheduler is two different thing.   I just counted in the "drivers"
> > subdirectory, there are at least more than 200 places where
> > "setup_timer()" is called, and these have nothing to do with
> > scheduling.   For eg, heartbeat operation etc.  Not sure I
> > misunderstood something?
> Have a look at kernel/timer.c and kernel/hrtimer.c.
> There are many sched() calls in these files.This will invoke scheduler.
> >
>

kernel/timer.c and kernel/hrtimer.c implement logic that runs outside of
timer interrupt context, ie, it is NOT executed in timer interrupt context,
but in bottom half context.   the real timer interrupt context is handled in
the arch-specific branch:  arch/x86/kernel/tsc.c, for example, and the entire
tsc.c has no scheduling concept in it.   the entire file tsc.c in fact is
handling all the hardware-specific stuff - in the top-half context.

in the linux kernel scheduling is done in two ways:  voluntary and involuntary
scheduling.   involuntary scheduling means it is triggered by the timer
interrupt.   but voluntary scheduling (which was only recently introduced
into the kernel for performance reasons) drastically improves the latency
numbers...voluntary scheduling is NOT triggered by the timer, but ANYONE who
wants to give up the CPU can call schedule() to do a rescheduling.

hope i am not wrong.

> >
> > --
> > Regards,
> > Peter Teoh
>
>
>

-- 
Regards,
Peter Teoh

Re: Kernel code interrupted by Timer

2013-02-08 Thread Peter Teoh

On Sat, Feb 9, 2013 at 1:47 PM, anish kumar
.
>
> Timer interrupts is supposed to cause scheduling and scheduler may or
> may not pick up your last process(we always use the term "task" in
> kernel space) after handling timer interrupt.
> >
>

Sorry if I may disagree, correct me if wrong.   Timer interrupt and
scheduler are two different things.   I just counted in the "drivers"
subdirectory, and there are more than 200 places where "setup_timer()"
is called, and these have nothing to do with scheduling.   For eg,
heartbeat operation etc.  Not sure if I misunderstood something?

-- 
Regards,
Peter Teoh

Re: Kernel code interrupted by Timer

2013-02-08 Thread Peter Teoh

On Sat, Feb 9, 2013 at 8:08 AM, Peter Teoh  wrote:

>
>
> On Sat, Feb 9, 2013 at 1:08 AM, Gaurav Jain wrote:
>
>> What happens if the kernel executing in some process context (let's say
>> executing a time-consuming syscall) gets interrupted by the Timer - which
>> is apparently allowed in 2.6 onwards kernels.
>>
>> My understanding is that once the interrupt handler is done executing, we
>> should switch back to where the kernel code was executing. Specifically,
>> the interrupt handler for the Timer interrupt should not schedule some
>> other task since that might leave kernel data in an inconsistent state -
>> kernel didn't finish doing whatever it was doing when interrupted.
>>
>
> at the microscopic level, every stream of assembly instructions can always
> be broken up and intercepted by interrupt, and possibly switched into
> another stream of assembly instruction or logic, the maintenance of state
> "consistency" is done via context switching.
>

context switching is done at software level, and i am not sure if there is a
difference between process context switch and thread/task level context
switching, but hardware only guarantees register context switch - and not
sure if it covers all the floating point (SSE) registers too (unlikely,
performance overheads)...so "consistency" is really about how you write your
software.   and u also have multiple switching (by different CPUs) all
taking place independently all the time, writing into the same piece of RAM.

I also know that nvidia GPU does not clean up its memory/state when
switching from one process to another process, but that is beyond the
control of hardware switching logic of the CPU anyway.

>
>> So, does the Timer interrupt handler include such a policy for the above
>> case?
>>
>> --
>> Gaurav Jain
>>
>>
>>
>>
>>
>>
>
>
> --
> Regards,
> Peter Teoh

-- 
Regards,
Peter Teoh

Re: Kernel code interrupted by Timer

2013-02-08 Thread Peter Teoh

On Sat, Feb 9, 2013 at 1:08 AM, Gaurav Jain  wrote:

> What happens if the kernel executing in some process context (let's say
> executing a time-consuming syscall) gets interrupted by the Timer - which
> is apparently allowed in 2.6 onwards kernels.
>
> My understanding is that once the interrupt handler is done executing, we
> should switch back to where the kernel code was executing. Specifically,
> the interrupt handler for the Timer interrupt should not schedule some
> other task since that might leave kernel data in an inconsistent state -
> kernel didn't finish doing whatever it was doing when interrupted.
>

at the microscopic level, every stream of assembly instructions can always
be broken up and intercepted by an interrupt, and possibly switched into
another stream of assembly instructions or logic; the maintenance of state
"consistency" is done via context switching.


> So, does the Timer interrupt handler include such a policy for the above
> case?
>
> --
> Gaurav Jain
>
>
>
>
>
>


-- 
Regards,
Peter Teoh

Re: hd controller

2013-02-07 Thread Peter Teoh

good sharing.   following up on your comments:

in the kernel source:

block/*.c are the files for block I/O related stuff - the layer just before
ATA, implementing stuff like elevator I/O etc.
drivers/block/*.c:  hardware-specific files that understand how to talk to
each type of harddisk.
drivers/scsi/*.c:   generally SCSI protocol related stuff (lib*.c), but may
contain device specific stuff.
drivers/ide/*.c:
drivers/ata/*.c:   among the lowest level just before sending out port I/O
operation.

On Fri, Feb 8, 2013 at 8:26 AM,  wrote:

> On Fri, 08 Feb 2013 07:48:39 +0800, Peter Teoh said:
>
> > So the drivers just literally concatenate these command into a string and
> > send it over to the device.
>
> The reason that good disk drivers are hard to write is because it isn't
> *just* literally concatenating the commands - it also has to do memory
> management (make sure that everybody's data ends up in the right buffers),
> command queue management, elevator management (if there's multiple I/O
> requests pending from userspace, what order do we issue them in?), error
> recovery, power management, and a ton of other stuff...
>

-- 
Regards,
Peter Teoh

Re: hd controller

2013-02-07 Thread Peter Teoh

at the lowest level, SCSI/IDE/SATA all share a common command base
(perhaps with variations) - which is the ATA command set (because in
drivers/ata/*.c u can find the symbol ATA_CMD_XXX in all the three
different hardware architectures):

Below is an example specified by the standards body (these commands are OS
agnostic):

https://github.com/gcastigl/SO2C2011TP2/blob/master/doc/ATA%20-%20ATAPI%20Command%20Set.pdf

Look at all the "ATA_CMD_*" command here:

https://github.com/Scorpiion/Renux_u-boot/blob/master/include/ata.h

So the drivers just literally concatenate these commands into a string and
send it over to the device.

for example in drivers/ata/libata-core.c:

static int ata_read_native_max_address(struct ata_device *dev,
                                       u64 *max_sectors)
{
        unsigned int err_mask;
        struct ata_taskfile tf;
        int lba48 = ata_id_has_lba48(dev->id);

        ata_tf_init(dev, &tf);

        /* always clear all address registers */
        tf.flags |= ATA_TFLAG_DEVICE | ATA_TFLAG_ISADDR;

        if (lba48) {
                tf.command = ATA_CMD_READ_NATIVE_MAX_EXT;
                tf.flags |= ATA_TFLAG_LBA48;
        } else
                tf.command = ATA_CMD_READ_NATIVE_MAX;
The tf.command value is ultimately sent to the device by a port I/O operation.
BUT I am not sure of the details, corrections welcome :-).
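
To make the opcode selection in the excerpt above easier to play with, here is
a minimal userspace sketch. It is not kernel code: struct toy_taskfile and
toy_read_native_max() are hypothetical names invented for illustration, and
only the two opcode values are taken from the kernel's include/linux/ata.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Opcode values as defined in the kernel's include/linux/ata.h. */
#define ATA_CMD_READ_NATIVE_MAX      0xF8
#define ATA_CMD_READ_NATIVE_MAX_EXT  0x27

/* Hypothetical, cut-down stand-in for the kernel's struct ata_taskfile. */
struct toy_taskfile {
	uint8_t command;  /* opcode destined for the command register */
	bool    lba48;    /* true when the 48-bit variant was chosen */
};

/* Choose the opcode the same way the libata excerpt does. */
struct toy_taskfile toy_read_native_max(bool dev_has_lba48)
{
	struct toy_taskfile tf = { 0 };

	if (dev_has_lba48) {
		tf.command = ATA_CMD_READ_NATIVE_MAX_EXT;
		tf.lba48 = true;
	} else {
		tf.command = ATA_CMD_READ_NATIVE_MAX;
	}
	return tf;
}
```

The point is only that the "command" is a one-byte opcode stored in a
structure of register values; how that structure reaches the hardware is the
part the real driver handles.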

On Thu, Feb 7, 2013 at 4:19 PM, horseriver  wrote:

> hi:)
>
>I am curious about how hd controller work .
>When user am reaing/writing hd ,it was implemented by sending command
>to hd controller's special port.Then ,how does the controller know
>a new command has received?
>
>In this procedure , what work does the hd driver do ?
>
> thanks!
>
>



-- 
Regards,
Peter Teoh

Re: thread concurrent file operation

2013-02-07 Thread Peter Teoh

To generalize further, you can safely say that all synchronous operations have
to be thread-safe, except for some APIs as listed here:

http://pubs.opengroup.org/onlinepubs/007904975/functions/xsh_chap02_09.html

The Linux kernel may guarantee thread safety, but this only applies to
serializing data at the per-syscall level, i.e., every read() will
complete before another read() from another thread can interleave with it.
But at the file level you may still get file corruption or mangled file data
structures if you mix writes and reads without proper serialization at the
userspace level. Thus kernel locking and userspace locking are both needed,
for different purposes.

below discussion is useful (first answer esp):

http://stackoverflow.com/questions/5268307/thread-safety-of-read-pread-system-calls

In the kernel, each open file (the struct file that a file descriptor refers
to) has only one single offset value to indicate the current file pointer
position. So at the userspace level, different read/write combinations will
all move that one file pointer value, which also explains why userspace
locking (for logical reasons) is needed.
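
One way to sidestep the shared offset is pread(), which takes the offset as an
argument. Below is a minimal sketch, assuming a Linux/POSIX system with a
writable /tmp; the function name pread_keeps_offset() is made up for this
example.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Returns 0 if pread() fetched the expected bytes and left the shared
 * file offset untouched, -1 otherwise.
 */
int pread_keeps_offset(void)
{
	char path[] = "/tmp/pread-demo-XXXXXX";
	char buf[4] = { 0 };
	int ok = -1;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path);                     /* file disappears once fd is closed */

	if (write(fd, "0123456789", 10) != 10)
		goto out;
	if (lseek(fd, 2, SEEK_SET) != 2)  /* the shared offset is now 2 */
		goto out;

	/* Read 3 bytes at offset 5 without using the shared offset. */
	if (pread(fd, buf, 3, 5) != 3 || memcmp(buf, "567", 3) != 0)
		goto out;

	/* A plain read() would have moved the offset; pread() did not. */
	if (lseek(fd, 0, SEEK_CUR) == 2)
		ok = 0;
out:
	close(fd);
	return ok;
}
```

Threads that each call pread()/pwrite() with their own offsets do not fight
over the file position, although overlapping writes still need locking.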

On Thu, Feb 7, 2013 at 6:23 PM, Peter Teoh  wrote:

> Multiple concurrent write() by different thread is possible, as they all
> can share the same file descriptor in a single similar process, and this is
> not allowed.   So nevertheless, the problem you posed is not
> allowed/acceptable by the kernel, so Linus himself fixed it:
>
> See here:
>
> http://lwn.net/Articles/180387/
>
> And Linus patch:
>
> http://lwn.net/Articles/180396/
>
> but my present version (3.2.0) has rcu lock over it (higher performance):
>
> INIT_LIST_HEAD(&f->f_u.fu_list);
> atomic_long_set(&f->f_count, 1);
> rwlock_init(&f->f_owner.lock);
> spin_lock_init(&f->f_lock);
> eventpoll_init_file(f);
> /* f->f_version: 0 */
>
>
>  On Thu, Feb 7, 2013 at 4:44 PM, Karaoui mohamed lamine <
> mohar...@gmail.com> wrote:
>
>>
>> Tahnks guys!
>>
>> 2013/1/30 Karaoui mohamed lamine 
>>
>>> thanks, i think i get it.
>>>
>>> 2013/1/30 
>>>
>>> On Tue, 29 Jan 2013 20:16:26 +0100, you said:
>>>>
>>>> > Actually my question is :
>>>> > Does POSIX specifies  the fact that we need to use "lockf" to be able
>>>> to do
>>>> > read/write operation in different offset ? Is'n the kernel supposed to
>>>> > ensure this ?
>>>>
>>>> If you have non-overlapping writes, the kernel will eventually sort it
>>>> out
>>>> for you.  If your writes overlap, you'll have to provide your own
>>>> locking
>>>> via lockf() or similar, and synchronization via other methods.
>>>>
>>>
>>>
>>
>>
>>
>
>
> --
> Regards,
> Peter Teoh
>



-- 
Regards,
Peter Teoh

Re: thread concurrent file operation

2013-02-07 Thread Peter Teoh

Multiple concurrent write() calls by different threads are possible, as they
can all share the same file descriptor within a single process, and the kernel
must not let them corrupt its own state. So the problem you posed is not
acceptable to the kernel, and Linus himself fixed it:

See here:

http://lwn.net/Articles/180387/

And Linus patch:

http://lwn.net/Articles/180396/

but my present version (3.2.0) has rcu lock over it (higher performance):

INIT_LIST_HEAD(&f->f_u.fu_list);
atomic_long_set(&f->f_count, 1);
rwlock_init(&f->f_owner.lock);
spin_lock_init(&f->f_lock);
eventpoll_init_file(f);
/* f->f_version: 0 */


On Thu, Feb 7, 2013 at 4:44 PM, Karaoui mohamed lamine
wrote:

>
> Tahnks guys!
>
> 2013/1/30 Karaoui mohamed lamine 
>
>> thanks, i think i get it.
>>
>> 2013/1/30 
>>
>> On Tue, 29 Jan 2013 20:16:26 +0100, you said:
>>>
>>> > Actually my question is :
>>> > Does POSIX specifies  the fact that we need to use "lockf" to be able
>>> to do
>>> > read/write operation in different offset ? Is'n the kernel supposed to
>>> > ensure this ?
>>>
>>> If you have non-overlapping writes, the kernel will eventually sort it
>>> out
>>> for you.  If your writes overlap, you'll have to provide your own locking
>>> via lockf() or similar, and synchronization via other methods.
>>>
>>
>>
>
>
>


-- 
Regards,
Peter Teoh

Re: Creating scheduler

2013-02-06 Thread Peter Teoh

well...u asked for it:

http://abstract.cs.washington.edu/~shwetak/classes/ee472/assignments/lab2/lab2.pdf

http://www.cs.cmu.edu/~410-s07/p3/kernel.pdf

http://web.stonehill.edu/compsci/CS314/Assignments/Assignment0.pdf

http://www.cs.amherst.edu/~sfkaplan/courses/2012/spring/cs261/assignments/project-1.pdf

etc...googling returned me 27000 links

On Thu, Feb 7, 2013 at 1:49 AM, jeshkumar...@gmail.com <
jeshkumar...@gmail.com> wrote:

> Hi all :),
>
> Can anyone suggest a good tutorial to create our own scheduler ?
>
>
> Sent from my HTC
> Excuse for typo.
>
>
>
>


-- 
Regards,
Peter Teoh

Re: thread concurrent file operation

2013-02-06 Thread Peter Teoh

For ANY update/change, some form of synchronization is always needed, to
prevent multiple parties from updating at the same time. But locking is not
the only way: there are also lockless updates. One form used in the Linux
kernel is called RCU:

http://en.wikipedia.org/wiki/Read-copy-update

The logic is that whoever wants to change something does not modify it in
place: the updater makes a copy, changes the copy, and then publishes it by
switching a pointer, so the old and the new version coexist for a while.
(Oracle, and many other databases, keep multiple versions of data in a similar
spirit.) If multiple CPUs want to write to the same place, the updaters still
have to be serialized against each other, typically with a lock:

http://lwn.net/Articles/305782/

But for readers there is no need to lock: just go ahead and read. If you
started reading BEFORE the update was published, you keep reading the older
copy, and once the last such reader is done the older copy can be freed.

http://lwn.net/2001/features/OLS/pdf/pdf/read-copy.pdf

http://lwn.net/Articles/262464/

http://lwn.net/Articles/263130/  (see the picture here)
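
The copy-then-publish idea can be sketched in userspace with C11 atomics. This
is NOT the kernel's RCU API and it has no grace-period machinery; struct
config, live_config, config_read() and config_set_timeout() are hypothetical
names, and it assumes updaters are serialized by the caller.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct config {
	int timeout_ms;
	int retries;
};

/* The pointer readers follow; updaters replace it, never edit in place. */
static _Atomic(struct config *) live_config;

/* Reader side: a single acquire load, no lock taken. */
const struct config *config_read(void)
{
	return atomic_load_explicit(&live_config, memory_order_acquire);
}

/*
 * Updater side: copy the current version, modify the copy, publish it.
 * On success returns 0 and stores the previous version in *old_out; the
 * caller may free it only once no reader can still be using it.
 * Concurrent updaters must be serialized by the caller (e.g. with a lock).
 */
int config_set_timeout(int timeout_ms, struct config **old_out)
{
	struct config *old = atomic_load_explicit(&live_config,
						  memory_order_relaxed);
	struct config *fresh = malloc(sizeof(*fresh));

	if (!fresh)
		return -1;
	if (old)
		*fresh = *old;                    /* copy ... */
	else
		*fresh = (struct config){ 0 };
	fresh->timeout_ms = timeout_ms;           /* ... update the copy ... */

	/* ... and publish it with one atomic pointer switch. */
	*old_out = atomic_exchange_explicit(&live_config, fresh,
					    memory_order_acq_rel);
	return 0;
}
```

A reader that fetched the old pointer before the exchange keeps seeing a
consistent old version; knowing when that old version may safely be freed is
exactly the job the kernel's real RCU takes care of.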

But this locking is done at a low level; for a harddisk that is the data
block level.

The purpose of vfs_read() is to read, and it does not prevent you from
writing!!! Yes, locking/unlocking is left to the user at the userspace level,
because it has to be done at the FILE level. So if you have multiple reads
and then someone comes in and writes, yes, there will be corruption. But that
is logical corruption, not the hardware/data-block corruption which the
kernel aims to protect against.
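
As a rough illustration of the userspace side, here is a sketch that wraps a
write in an advisory lockf() range lock. The helper name locked_write_at() is
hypothetical. Note the assumption: lockf() locks belong to the process, so
this coordinates separate processes that all use the same convention; threads
inside one process would additionally need something like a mutex.

```c
#define _XOPEN_SOURCE 700
#include <sys/types.h>
#include <unistd.h>

/*
 * Write len bytes at offset off while holding an advisory lock on exactly
 * that byte range. Returns 0 on success, -1 on error.
 */
int locked_write_at(int fd, off_t off, const void *buf, size_t len)
{
	int ret = -1;

	/* lockf() locks a range starting at the current file offset. */
	if (lseek(fd, off, SEEK_SET) != off)
		return -1;
	if (lockf(fd, F_LOCK, (off_t)len) != 0)  /* blocks until range is free */
		return -1;

	/* pwrite() does not move the offset, so it still equals off below. */
	if (pwrite(fd, buf, len, off) == (ssize_t)len)
		ret = 0;

	if (lockf(fd, F_ULOCK, (off_t)len) != 0)
		ret = -1;
	return ret;
}
```

The lock is advisory: a process that writes without calling lockf() first is
not stopped by the kernel.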

On Tue, Jan 29, 2013 at 11:35 PM, Karaoui mohamed lamine  wrote:

> Hello,
>
> I was looking at how a syscall read/write was done, and i found this :
>
>
>loff_t pos = file_pos_read(f.file);
>ret = vfs_read(f.file, buf, count, &pos);
>file_pos_write(f.file, pos);
>fdput(f);
>...
>
> My questions are :
>
> Where did the locking go? I would have imaginated something like :
>
>
>*lock(f);*
>loff_t pos = file_pos_read(f.file);
>ret = vfs_read(f.file, buf, count, &pos);
>file_pos_write(f.file, pos);
>fdput(f);
>*unlock(f);*
>...
>
> If multiple threads try to read/write at the same time, they could
> read/write at the same offset ?
>
> If my understanding are correct, is this POSIX compliant ?
>
>
> thanks.
>
>
>
>

-- 
Regards,
Peter Teoh

Re: hard disk dirver

2013-02-06 Thread Peter Teoh

On Wed, Feb 6, 2013 at 1:21 PM, horseriver  wrote:

> hi:)
>
>I have a newbie question about hard ware.
>At booting stage,kernel need to detect the hard device before mount it,
>does this work  need pci's surport?
>

>At loading stage ,boot loader need to move binaries from hard disk
> partition
>to ram,does this work need pci's surport?
>

Hard disk I/O goes over the ATA bus, and PCI has its own bus on the chipset
(see page 69):

http://downloadmirror.intel.com/19123/eng/d525mw_d525mwv_techprodspec.pdf

and page 14:

http://download.intel.com/support/motherboards/desktop/d865gsa/sb/d5600601us.pdf

But these are terminologies. At the source code level (and in the tools as
well), PCI and ATA are not differentiated much, since the ATA controller
usually shows up as a device on the PCI bus:

in both drivers/ata/ata_piix.c and drivers/pci/quirks.c you can
see that 82801 symbols exist.

For your problem, I think it is a BOCHS problem when mixed with a recent Linux
kernel (older kernels should be fine), e.g.:

http://forums.gentoo.org/viewtopic-t-915210-view-previous.html?sid=a003ebbc022d7f23399fc7f1c5dad424

(notice the 3.2 kernel), which was resolved by setting the PCI configuration
in BOCHS as well. Take a look.

> thanks!
>

-- 
Regards,
Peter Teoh

Re: When does the /dev/sda1 node comes into being ?

2013-02-05 Thread Peter Teoh

I think it depends. Some nodes are softlinks in /dev/, and some are created by
udev after udevd reads its configuration files; many scenarios are involved
(just search for "util_create_path" inside the udev source code and you can
see all the situations).

But for a harddisk whose partition is also the rootfs, /dev/sda is created
during kernel boot, inside the initrd. Just gunzip the initrd file and extract
the cpio archive, then view the file scripts/local, and you can see it make
the /dev/sd nodes based on /sys/block/ information, which in turn
depends on the kernel calling the _device_register() functions (there are a
few variations of them, organized hierarchically).

On the other hand, if /dev/sda is not the rootfs but just a normal
harddisk listed in /etc/fstab, then its node is likely created by udev once
the kernel has detected the disk. The kernel's sd_probe_async() is what
printk()s the "Write Protect is off" message in your dmesg output; any time
you plug in a harddisk you can see this.
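
The sysfs step can be sketched in C. Each block device directory under
/sys/block/ has a "dev" attribute holding "major:minor" as text; the
hypothetical helper below (parse_sysfs_dev() is not from udev or the initrd
scripts) turns that text into the dev_t that mknod() would need. It assumes
Linux with glibc's <sys/sysmacros.h>.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/sysmacros.h>

/*
 * Parse the "major:minor" text of a sysfs "dev" attribute into a dev_t.
 * Returns 0 on success, -1 if the text is not in that form.
 */
int parse_sysfs_dev(const char *text, dev_t *out)
{
	unsigned int maj, min;

	if (sscanf(text, "%u:%u", &maj, &min) != 2)
		return -1;
	*out = makedev(maj, min);
	return 0;
}
```

With the resulting dev_t, a root process could then create the node with
mknod(path, S_IFBLK | 0600, dev), which is roughly what the scripts and udev
end up doing on your behalf.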

On Wed, Feb 6, 2013 at 1:26 AM, horseriver  wrote:

> hi:)
>
>   During booting period .every device will have a node at /dev/ folder.
>   what is the detail of ths procedure?
>
> thanks!
>
>

-- 
Regards,
Peter Teoh

Re: How to analyze kernel Oops dump

2013-02-05 Thread Peter Teoh

perhaps let me try:

The cause of crash is here:

[  493.113464] Unable to handle kernel paging request at virtual address
f6b9f777
[  493.124298] pgd = ec4c4000
[  493.127166] [f6b9f777] *pgd=

i.e., the page directory is at 0xec4c4000, and its entry for the faulting
address 0xf6b9f777 is zero, so that address is not mapped.
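
For the curious, the location of that entry can be worked out by hand. The
sketch below ASSUMES a 32-bit ARM kernel without LPAE, where Linux uses a
PGDIR_SHIFT of 21 and 8-byte pgd entries; if the kernel in question was built
differently the numbers change. The toy_ names are invented for this example.

```c
#include <stdint.h>

/* Assumed values for 32-bit ARM without LPAE (two-level page tables). */
#define TOY_PGDIR_SHIFT     21
#define TOY_PGD_ENTRY_SIZE  8u  /* one Linux pgd entry = two hardware entries */

/* Index of the pgd entry that covers a virtual address. */
uint32_t toy_pgd_index(uint32_t vaddr)
{
	return vaddr >> TOY_PGDIR_SHIFT;
}

/* Address of that entry, given the base printed as "pgd = ..." in the oops. */
uint32_t toy_pgd_entry_addr(uint32_t pgd_base, uint32_t vaddr)
{
	return pgd_base + toy_pgd_index(vaddr) * TOY_PGD_ENTRY_SIZE;
}
```

Under those assumptions, toy_pgd_index(0xf6b9f777) is 0x7b5 and the entry
lives at 0xec4c4000 + 0x7b5 * 8 = 0xec4c7da8.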

at the time of crash the set of register values are:

[  493.169158] PC is at __kmalloc_track_caller+0xa4/0x1ec
[  493.174591] LR is at 0x80569dc0
[  493.177917] pc : [<801094d8>]lr : [<80569dc0>]psr: a113
[  493.177947] sp : 80569dc0  ip : 89011b70  fp : 80569dfc
[  493.190124] r10: 1fea  r9 : 0001  r8 : 
[  493.195648] r7 : 0940  r6 : 00d1  r5 : ed002900  r4 : f6b9f777
[  493.202575] r3 : 80568000  r2 :   r1 : 08aa8000  r0 : 80589c00
[  493.209503] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM Segment
kernel
[  493.217254] Control: 10c5387d  Table: ec4c406a  DAC: 0015

Take the same version of the kernel source, and you can see that line 3415
matches exactly the warning message in the error log:

size_t ksize(const void *object)
{
	struct page *page;

	if (unlikely(object == ZERO_SIZE_PTR))
		return 0;

	page = virt_to_head_page(object);

	if (unlikely(!PageSlab(page))) {
		WARN_ON(!PageCompound(page));  => this is line 3415
		return PAGE_SIZE << compound_order(page);
	}

	return slab_ksize(page->slab);
}
EXPORT_SYMBOL(ksize);

Because ksize is an exported symbol, the kernel image has "ksize" as the
symbol nearest the crash point, which is located +0x70 from "ksize".

As for the reason the page's compound-page attribute has not been set
correctly, you have to read the history:

[  494.068664] Backtrace:
[  494.071289] [<80109434>] (__kmalloc_track_caller+0x0/0x1ec) from
[<80335ec0>] (__alloc_skb+0x60/0xfc)
[  494.081085] [<80335e60>] (__alloc_skb+0x0/0xfc) from [<80336530>]
(__netdev_alloc_skb+0x2c/0x54)
[  494.090423] [<80336504>] (__netdev_alloc_skb+0x0/0x54) from [<7f078788>]
(stmmac_poll+0x590/0x794 [stmmac])
[  494.100738]  r4:ed0b84c0 r3:
[  494.104553] [<7f0781f8>] (stmmac_poll+0x0/0x794 [stmmac]) from
[<8033f23c>] (net_rx_action+0x88/0x1f0)
[  494.114440] [<8033f1b4>] (net_rx_action+0x0/0x1f0) from [<80045fb4>]
(__do_softirq+0x12c/0x260)
[  494.123657] [<80045e88>] (__do_softirq+0x0/0x260) from [<8004659c>]
(irq_exit+0x58/0xb0)
[  494.132263] [<80046544>] (irq_exit+0x0/0xb0) from [<8000fa08>]
(handle_IRQ+0x8c/0xc8)
[  494.140563]  r4:0078 r3:020c
[  494.144378] [<8000f97c>] (handle_IRQ+0x0/0xc8) from [<80008658>]
(gic_handle_irq+0x48/0x6c)
[  494.153228]  r5:80569f40 r4:fa212000
[  494.157043] [<80008610>] (gic_handle_irq+0x0/0x6c) from [<8000e600>]
(__irq_svc+0x40/0x70)
[  494.165802] Exception stack(0x80569f40 to 0x80569f88)
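
As a side note on reading these lines: "symbol+0xOFF/0xSIZE" means the address
lies OFF bytes into a function that is SIZE bytes long. A tiny sketch of that
arithmetic follows; struct sym_span and addr_in_symbol() are hypothetical
names, not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Start address and length of one function, as printed in a backtrace. */
struct sym_span {
	uint32_t start;
	uint32_t size;
};

/*
 * If addr falls inside the function, store its offset from the start and
 * return true; otherwise return false and leave *offset alone.
 */
bool addr_in_symbol(struct sym_span s, uint32_t addr, uint32_t *offset)
{
	if (addr < s.start || addr - s.start >= s.size)
		return false;
	*offset = addr - s.start;
	return true;
}
```

Using the numbers from the log: __kmalloc_track_caller starts at 0x80109434
with size 0x1ec, and the PC 0x801094d8 gives offset 0xa4, matching
"__kmalloc_track_caller+0xa4/0x1ec". Likewise 0x80335ec0 is +0x60 into
__alloc_skb, which starts at 0x80335e60.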

From the above, I can only guess that the possible calling sequence is as below:

In net/core/skbuff.c:

struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask,
			    int fclone, int node)
{
	...
	size = SKB_WITH_OVERHEAD(ksize(data));
	prefetchw(data + size);

Notice the __alloc_skb() ==> ksize() path, which ended up with the *pgd error
above?

Look also at the few functions below stmmac_poll(): the offset 0x590 is
quite far from the start of stmmac_poll(), so the code is unlikely to be in
this function itself. The functions that follow it are declared "static" and
may well have been inlined by the compiler, in which case the disassembly
will still attribute them to the "stmmac_poll" symbol. It seems like a
descriptor-related bug.

See this:

http://comments.gmane.org/gmane.linux.network/236183

whose version comes after 3.4.0, or 3.4.6 - to be specific:

http://lwn.net/Articles/507526/

-- 
Regards,
Peter Teoh

Re: How the follow Starts in Android-Kernel

2013-02-05 Thread Peter Teoh

http://duartes.org/gustavo/blog/post/how-computers-boot-up

this is for x86, not for ARM though.

On Wed, Feb 6, 2013 at 10:30 AM, Peter Teoh  wrote:

>
> Normally in embedded systems U-Boot is the bootloader, and tracing this is
> simple:
>
> a.   Understand how U-Boot works, which is highly platform specific
> (U-Boot is highly hardware dependent), and examine the point where control
> is passed to the kernel image (on ARM there is no 16-bit real mode as on
> x86; the kernel is entered with the MMU off), and from there you can trace
> everything.
>
> b.   Well, you need assembly, as everything at the start is written in
> assembly. For ARM (as you asked about Android), the entry point is "stext"
> in arch/arm/kernel/head.S, and the jump to "start_kernel" is inside:
>
> arch/arm/kernel/head-common.S
>
> And then you must learn linker scripting (for ARM it is
> arch/arm/kernel/vmlinux.lds.S) as well; that is how you tell the linker to
> generate an image that can be loaded directly into memory and executed
> directly on the hardware. There is no program loader at this stage to load
> the binary: U-Boot loads it as a raw image, not as an executable file.
>
> the rest is yours...
>
> On Mon, Feb 4, 2013 at 12:34 PM, Ranganath T.M wrote:
>
>> Hi All,
>>
>> I am trying to find out how the kernel will *start* from the uboot and
>> how the kernel will call there respective static modules which are built as
>> *.o* file and also how the *probe* of every modules will be called.
>>
>> Thanks And Regards
>> Ranganath
>>
>>
>>
>
>
> --
> Regards,
> Peter Teoh
>



-- 
Regards,
Peter Teoh

Re: How the follow Starts in Android-Kernel

2013-02-05 Thread Peter Teoh

Normally in embedded systems U-Boot is the bootloader, and tracing this is
simple:

a.   Understand how U-Boot works, which is highly platform specific
(U-Boot is highly hardware dependent), and examine the point where control
is passed to the kernel image (on ARM there is no 16-bit real mode as on
x86; the kernel is entered with the MMU off), and from there you can trace
everything.

b.   Well, you need assembly, as everything at the start is written in
assembly. For ARM (as you asked about Android), the entry point is "stext" in
arch/arm/kernel/head.S, and the jump to "start_kernel" is inside:

arch/arm/kernel/head-common.S

And then you must learn linker scripting (for ARM it is
arch/arm/kernel/vmlinux.lds.S) as well; that is how you tell the linker to
generate an image that can be loaded directly into memory and executed
directly on the hardware. There is no program loader at this stage to load
the binary: U-Boot loads it as a raw image, not as an executable file.

the rest is yours...

On Mon, Feb 4, 2013 at 12:34 PM, Ranganath T.M wrote:

> Hi All,
>
> I am trying to find out how the kernel will *start* from the uboot and
> how the kernel will call there respective static modules which are built as
> *.o* file and also how the *probe* of every modules will be called.
>
> Thanks And Regards
> Ranganath
>
>
>

-- 
Regards,
Peter Teoh

Re: Linux Kernel Networking document (free, 178 pages doc)

2013-02-03 Thread Peter Teoh

http://www.haifux.org/lectures.html

This link has even more lectures.

On Mon, Feb 4, 2013 at 11:55 AM, Peter Teoh  wrote:

> Good sharing and info.   I thought it is also useful to share your
> lectures materials at:
>
> http://www.haifux.org/rami_rosen.html
>
> which I must highlight has lots of work done since 2007.   Keep up the
> good work!!
>
>
> On Tue, Jan 29, 2013 at 12:53 AM, Rami Rosen  wrote:
>
>> Hi everyone,
>> You can find here an up to date and detailed document in pdf (178
>> pages) about Linux Kernel Networking; going deep into design and
>> implementation details as well as the theory behind it:
>> http://media.wix.com/ugd//295986_931b8bcf34d93419d46e05b5aa5d0216.pdf
>>
>> I believe that developers/sysadmins/researchers/students may find help
>> with it.
>>
>>
>> regards,
>> Rami Rosen
>>
>> http://ramirose.wix.com/ramirosen
>>
>>
>
>
>
> --
> Regards,
> Peter Teoh
>



-- 
Regards,
Peter Teoh

Re: Linux Kernel Networking document (free, 178 pages doc)

2013-02-03 Thread Peter Teoh

Generally, anything you write for ext2 should still be valid for ext3 and
ext4, in the sense that the features are backward compatible. Size
limits may have increased, but the OLD working mechanisms should still be
valid, except for some.

So an ext2 fs should still be mountable as ext4, but not vice versa once some
incompatible feature flag is enabled (I think it is xattr; features such as
extents certainly have this effect). If no such flag is enabled, and
the journal log is clean, then an ext4 fs is also mountable as an ext2 fs:

http://superuser.com/questions/408822/ext4-converted-mounted-as-ext2

http://computer-forensics.sans.org/blog/2011/06/14/digital-forensics-mounting-dirty-ext4-filesystems

http://en.wikipedia.org/wiki/Extended_file_attributes

On Sun, Feb 3, 2013 at 12:26 AM, Rami Rosen  wrote:

> Hi,
> > ext2 and ext3 are kind of obsolete now.
>
> Indeed, ext4 was integrated into Linux kernel back in 2008.
> Amongs its known features which do not exist in ext3 are support for
> huge files (like   1 EB (exabyte or somtimes termed exbibyte); 1 EB is
>  1024 PB (petabyte) whereas
> 1 PB is  1024 TB (terabyte).
> a directory can contain a maximum of 64,000 subdirectories (whereas we
> have 32,000 in ext3)
> Amongst its other features are Journal checksumming, Multiblock
> allocator, Faster file system checking and more.
>
>
> If you prefer to start with simpler implementations, ext3 is of course
> simpler, and of course ext2 is even simpler than ext3.
>
> But in case you intend to start with ext2/ext3, and later perform
> a pass on all your documentation to update it to ext4, take into
> consideration that this will take quite a time; depending on how deep
> you intend to delve into implementation details.
>
> Good luck!
>
> Regards,
> Rami Rosen
> http://ramirose.wix.com/ramirosen
>
>
>
> On Sat, Feb 2, 2013 at 11:43 AM, Shubham Sharma
>  wrote:
> > Hi,
> >
> > I understand that ext2 and ext3 are kind of obsolete now. But AFAIK,
> there
> > is not much difference in ext3 and ext4.
> >
> > Moreover for a newbie , it is better to start with ext3. What you think ?
> >
> > Regards
> > Shubham
> >
> >
> > On Fri, Feb 1, 2013 at 2:15 AM, Rami Rosen  wrote:
> >>
> >> Hi,
> >> Have you considered to start with ext4?
> >> it seems that ext3, ext2 are a bit out of fashion,
> >>
> >> Regards,
> >> Rami Rosen
> >> http://ramirose.wix.com/ramirosen
> >>
> >>
> >> On Thu, Jan 31, 2013 at 8:58 PM, shubham 
> wrote:
> >> > Thanks Rami,
> >> >
> >> > I am also trying to understand ext3 and write some document for the
> >> > same.
> >> >
> >> > Regards
> >> > Shubham
> >> >
> >> >
> >> > On 31-Jan-13 12:51 AM, Rami Rosen wrote:
> >> >>
> >> >> HI,
> >> >> I will try to write something for Linux Filesystems  (and maybe for
> >> >> other subsystems) but this will probably take a lot of time.
> >> >>
> >> >> Regards,
> >> >> Rami Rosen
> >> >> http://ramirose.wix.com/ramirosen
> >> >>
> >> >>
> >> >> On Wed, Jan 30, 2013 at 5:44 PM, shubham 
> >> >> wrote:
> >> >>>
> >> >>> Thanks for sharing the document.
> >> >>>
> >> >>> I hope we could have such documents for other subsystems as well.
> >> >>>
> >> >>> Regards
> >> >>> Shubham
> >> >>>
> >> >>>
> >> >>> On 28-Jan-13 10:23 PM, Rami Rosen wrote:
> >> >>>>
> >> >>>> Hi everyone,
> >> >>>> You can find here an up to date and detailed document in pdf (178
> >> >>>> pages) about Linux Kernel Networking; going deep into design and
> >> >>>> implementation details as well as the theory behind it:
> >> >>>>
> http://media.wix.com/ugd//295986_931b8bcf34d93419d46e05b5aa5d0216.pdf
> >> >>>>
> >> >>>> I believe that developers/sysadmins/researchers/students may find
> >> >>>> help
> >> >>>> with it.
> >> >>>>
> >> >>>>
> >> >>>> regards,
> >> >>>> Rami Rosen
> >> >>>>
> >> >>>> http://ramirose.wix.com/ramirosen
> >> >>>>
> >> >>>
> >> >>>
> >> >
> >
> >
>
>



-- 
Regards,
Peter Teoh

Re: Linux Kernel Networking document (free, 178 pages doc)

2013-02-03 Thread Peter Teoh

Good sharing and info.   I thought it is also useful to share your lectures
materials at:

http://www.haifux.org/rami_rosen.html

which I must highlight has lots of work done since 2007.   Keep up the good
work!!

On Tue, Jan 29, 2013 at 12:53 AM, Rami Rosen  wrote:

> Hi everyone,
> You can find here an up to date and detailed document in pdf (178
> pages) about Linux Kernel Networking; going deep into design and
> implementation details as well as the theory behind it:
> http://media.wix.com/ugd//295986_931b8bcf34d93419d46e05b5aa5d0216.pdf
>
> I believe that developers/sysadmins/researchers/students may find help
> with it.
>
>
> regards,
> Rami Rosen
>
> http://ramirose.wix.com/ramirosen
>
>



-- 
Regards,
Peter Teoh

Re: How to wake_up the wait_queue of a socket?

2013-01-18 Thread Peter Teoh

On Sat, Jan 19, 2013 at 1:36 AM, horseriver  wrote:

> On Fri, Jan 18, 2013 at 10:18:19AM +0800, Peter Teoh wrote:
> > essentially, when the packet arrive, it will be assigned to the correct
> > process based on IP address + port matching, and then the corresponding
> > process's blocked scheduling status will be changed to continue
> execution,
> > so that when the scheduler next selection of runnable process will pick
> him
> > out for continue execution.   The process will then pick his data up from
> > the network queue.
> >
>
>   Thanks!
>
>   If there is no event occured on one socket descriptor  ,
>   will the poll operation on this socket descriptor be blocked ?
>

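I/O mechanisms come in two types: blocking and non-blocking. Neither poll()
nor select() is tied to one type by definition: both block until an event
occurs or the timeout expires, and both return immediately when given a zero
timeout. So yes, with no event on the socket descriptor and a non-zero
timeout, poll() will block. In general that is true for the kernel source as
well.

In the details and implementations there may be ambiguity.

For example, the manpage says poll() takes a timeout for blocking, and inside
the kernel source:

in fs/select.c, the definition of the select() syscall: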

SYSCALL_DEFINE5(select, int, n, fd_set __user *, inp, fd_set __user *, outp,
		fd_set __user *, exp, struct timeval __user *, tvp)
{
	struct timespec end_time, *to = NULL;
	struct timeval tv;
	int ret;

	if (tvp) {
		if (copy_from_user(&tv, tvp, sizeof(tv)))
			return -EFAULT;

		to = &end_time;
		if (poll_select_set_timeout(to,
				tv.tv_sec + (tv.tv_usec / USEC_PER_SEC),
				(tv.tv_usec % USEC_PER_SEC) * NSEC_PER_USEC))
			return -EINVAL;
	}

	ret = core_sys_select(n, inp, outp, exp, to);
	ret = poll_select_copy_remaining(&end_time, tvp, 1, ret);
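
The two arguments passed to poll_select_set_timeout() above simply normalize a
timeval whose microsecond field may be a second or more. A small userspace
sketch of the same arithmetic (struct toy_timespec and toy_normalize() are
hypothetical names; the two constants mirror the kernel's values):

```c
#define USEC_PER_SEC   1000000L
#define NSEC_PER_USEC  1000L

/* Hypothetical stand-in for the (seconds, nanoseconds) pair. */
struct toy_timespec {
	long sec;
	long nsec;
};

/* Fold whole seconds out of tv_usec and convert the rest to nanoseconds. */
struct toy_timespec toy_normalize(long tv_sec, long tv_usec)
{
	struct toy_timespec ts;

	ts.sec = tv_sec + tv_usec / USEC_PER_SEC;
	ts.nsec = (tv_usec % USEC_PER_SEC) * NSEC_PER_USEC;
	return ts;
}
```

For instance, a caller passing 1 second and 2500000 microseconds ends up with
3 seconds and 500000000 nanoseconds.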


And for the poll() syscall (same file):

SYSCALL_DEFINE3(poll, struct pollfd __user *, ufds, unsigned int, nfds,
		long, timeout_msecs)
{
	struct timespec end_time, *to = NULL;
	int ret;

	if (timeout_msecs >= 0) {
		to = &end_time;
		poll_select_set_timeout(to, timeout_msecs / MSEC_PER_SEC,
			NSEC_PER_MSEC * (timeout_msecs % MSEC_PER_SEC));
	}

So there is this common function poll_select_set_timeout() called by
both. The details are even more confusing, so I shall stop here.
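
From userspace the timeout behaviour is easy to try. Below is a minimal
sketch using a pipe instead of a socket (an assumption made to keep it
self-contained); poll_pipe_demo() is a made-up name.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Poll the read end of a fresh pipe twice with a zero timeout: once while
 * it is empty, once after one byte was written. Stores both poll() return
 * values; returns 0 on success, -1 if the setup failed.
 */
int poll_pipe_demo(int *empty_ret, int *ready_ret)
{
	int fds[2];
	int rc = -1;
	struct pollfd pfd;

	if (pipe(fds) != 0)
		return -1;

	pfd.fd = fds[0];
	pfd.events = POLLIN;
	pfd.revents = 0;

	/* Timeout 0: report the current state and return at once. */
	*empty_ret = poll(&pfd, 1, 0);

	if (write(fds[1], "x", 1) == 1) {
		*ready_ret = poll(&pfd, 1, 0);
		rc = 0;
	}

	close(fds[0]);
	close(fds[1]);
	return rc;
}
```

With a positive timeout the first call would sleep for up to that many
milliseconds, and with a negative timeout it would block until data arrives,
which matches the "if (timeout_msecs >= 0)" test in the kernel excerpt.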

A good article on epoll etc:

http://www.eecs.berkeley.edu/~sangjin/2012/12/21/epoll-vs-kqueue.html



> > >
> >
> >
> >
> > --
> > Regards,
> > Peter Teoh
>



-- 
Regards,
Peter Teoh

Re: no error thrown with exit(0) in the child process of vfork()

2013-01-18 Thread Peter Teoh

On Sat, Jan 19, 2013 at 10:43 AM, Peter Teoh wrote:

>
>
> On Sat, Jan 19, 2013 at 5:49 AM,  wrote:
>
>> On Fri, 18 Jan 2013 19:59:38 +0530, Niroj Pokhrel said:
>>
>> > I have been trying to create a process using vfork(). And both of the
>> child
>> > and the parent process execute it in the same address space. So, if I
>> > execute exit(0) in the child process, it should throw some error right.
>>
>> Why do you think it should throw an error?
>>
>> > Since the execution is happening in child process first and if I release
>> > all the resources by using exit(0) in the child process then parent
>> should
>> > be deprived of the resources and should throw some errors right ??
>>
>> No, because those resources that were shared across a fork() or vfork()
>> were in
>> general *multiple references* to the same resource.
>>
>>
> Yes, correct, Valdis is right.   Normally, when you free resources (which is
> what "exit()" will do), you must also remember to check something called a
> "reference count".
>
> The internal data structures of basic malloc() and free() memory management
> also carry other info such as the size (in many implementations stored just
> BEFORE the first byte the pointer points to).   More info:
>
> http://stackoverflow.com/questions/1957099/how-do-free-and-malloc-work-in-c
>
> but what is lacking is reference counting.   But this feature is available
> in Java and C++ libraries for memory allocation.
>
> Concept discussed here:
>
> http://stoneship.org/essays/c-reference-counting-and-you/
>
> Interesting
>
>
>> As an example - imagine a flagpole.  You grab it with your hand, you're
>> now holding it.  You invite your friend to come over and grab it with
>> his hand - now he's holding it too.
>>
>> But either one of you can let go of the flagpole - and the other one is
>> still holding the flagpole until *they* let go.  And the order you let
>> go doesn't matter in this case - which is important because your example
>> code has a race condition
>>
>> Note that there are other cases where the order people let go *does*
>> matter.
>> This is when you start having to worry about "locking order" and things
>> like
>> that.
>>
>> > In the following code, however the process ran fine even though I have
>> > exit(0) in the child process 
>>
>> > #include
>> > #include
>> > #include
>> > #include
>> > int main()
>> > {
>> > int val,i=0;
>> > val=vfork();
>> > if(val==0)
>> > {
>> > printf("\nI am a child process.\n");
>>
>> Note that printf() gets interesting due to stdio buffering.  You probably
>> want to call setbuf() and guarantee line-buffering of the output if you're
>> playing these sorts of games - the buffering can totally mask a real race
>> condition or other bug.
>>
>> > printf(" %d ",i++);
>> > exit(0);
>> > }
>> > else
>> > {
>>
>> /* race condition here - may want wait() or waitpid() to synchronize? */
>>
>> > printf("\nI am a parent process.\n");
>> > printf(" %d ",i);
>> > }
>> > return 0;
>> > }
>> > // The program is running fine .
>> > But as I have read it should throw some error right ?? I don't know
>> what I
>> > am missing . Please point out the point I'm missing. Thanking you in
>> > advance.
>>
>> You're also missing the fact that after the vfork(), there's no real
>> guarantee of which will run first - which means that the parent can race
>> and output the 'printf("%d",i)" *before* the child process gets a chance
>> to do the i++.
>>
>>
> I don't think there is any issue here (racing, or child calling exit
> before parent called exit()).   Read the man-page:
>
>     vfork() differs from fork(2) in that the parent is suspended until
>     the child terminates (either normally, by calling _exit(2), or
>     abnormally, after delivery of a fatal signal), or it makes a call to
>     execve(2).  Until that point, the child shares all memory with its
>     parent, including the stack.  The child must not return from the
>     current function or call exit(3), but may call _exit(2).
>
>
> So:
>
> 1.   if parent is suspended, it also means

Re: no error thrown with exit(0) in the child process of vfork()

2013-01-18 Thread Peter Teoh

d that the
> child would run first, on the theory that the child would often do
> something
> short that the parent was waiting on, so scheduling parent-first would just
> result in the parent running, blocking to wait, and we end up running the
> child anyhow before the parent could continue.  It broke an *amazing*
> amount
> of stuff in userspace because often the child would exit() before the
> parent was
> ready to deal with the child process's termination. Usual failure mode was
> the parent would set a SIGCHLD handler, and wait for the signal which never
> happened because the SIGCHLD actually fired *before* the handler was set
> up).
>
> (And on non-cache-coherent systems, it's even possible that the i++ happens
> on a different CPU first, and the CPU running the parent process never
> becomes
> aware of it.  See 'Documentation/memory-barriers.txt' in the Linux source
> for more info on how this works for data inside the kernel.  This example
> is out in userspace, so other techniques are required instead to do
> cross-CPU
> synchronization.)
>
> ___
> Kernelnewbies mailing list
> Kernelnewbies@kernelnewbies.org
> http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
>
>
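
A minimal sketch of what the vfork() man page asks for, assuming Linux and
glibc: the child leaves through _exit(2) rather than exit(3), and the parent
reaps it with waitpid() instead of relying on who runs first. The helper name
run_vfork_child() is made up for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: vfork a child that exits immediately with `code`,
 * then return the exit status the parent collected (or -1 on error). */
int run_vfork_child(int code)
{
    int status = 0;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;               /* vfork failed */

    if (pid == 0) {
        /* Child: it shares the parent's memory and stack, so it does
         * nothing except leave via _exit(2). exit(3) would flush stdio
         * buffers and run atexit handlers that belong to the parent. */
        _exit(code);
    }

    /* Parent: resumed only after the child has exited (or exec'd). */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_vfork_child(7) should hand back 7 in the parent. The waitpid()
also avoids leaving a zombie behind.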


-- 
Regards,
Peter Teoh
___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies

Re: How to wake_up the wait_queue of a socket?

2013-01-17 Thread Peter Teoh

Essentially, when the packet arrives, the kernel matches it to the right
socket by IP address and port, and the process blocked on that socket has
its scheduling state changed from blocked to runnable. The next time the
scheduler selects a runnable process, it can pick that process to continue
execution.   The process then picks up its data from the socket's receive
queue.

I hope I have not made any mistake in my logic?
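
Seen from user space, the same sequence can be sketched like this. It is only
an illustration: an AF_UNIX datagram socketpair stands in for a UDP socket,
and blocked_recv_demo() is a made-up name. The parent sleeps inside recv()
until the child's datagram is queued on the socket.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical demo: the parent blocks in recv() until the child sends
 * one datagram. Returns the number of bytes received, or -1 on error. */
ssize_t blocked_recv_demo(char *buf, size_t len)
{
    int sv[2];
    pid_t pid;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: give the parent time to block, then send. */
        struct timespec delay = { 0, 100 * 1000 * 1000 };   /* 100 ms */

        close(sv[0]);
        nanosleep(&delay, NULL);
        send(sv[1], "ping", 4, 0);
        _exit(0);
    }

    /* Parent: sleeps in the kernel until the datagram is queued on the
     * socket and the task is made runnable again. */
    close(sv[1]);
    n = recv(sv[0], buf, len, 0);
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

While the parent is inside recv() it uses no CPU time; it only runs again
after the sender's data has been queued.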

On Tue, Jan 15, 2013 at 8:36 AM, horseriver  wrote:

> On Tue, Jan 15, 2013 at 12:25:10PM -0500, valdis.kletni...@vt.edu wrote:
> > On Mon, 14 Jan 2013 17:50:03 +0800, horseriver said:
> >
> > >When one datagram has reached , How to wake_up the wait_queue of
> that socket ?
> >
> > Please clarify your question - I'm not sure which of the following you
> mean:
> >
>  1) How does the kernel wake up the waiting process when a datagram
>  arrives?
>
>   This is what I mean!
>
>   Thanks
>
>
>

-- 
Regards,
Peter Teoh

Re: working of fork and exec

2013-01-17 Thread Peter Teoh

Hi Mulyadi,

Great to see you again!

Sorry, can I fork on your explanation to explain further about fork?

Yes, "fork" is at the core of process management, scheduling and all that:

http://www.ibm.com/developerworks/linux/library/l-linux-process-management/

A good picture of a process splitting up (forking) is here:

http://www.linux-tutorial.info/modules.php?name=MContent&pageid=83

What happens to all the IPC after forking?

http://hzqtc.github.com/2012/07/linux-ipc-with-pipes.html

http://static.usenix.org/event/usenix2000/general/reumann/reumann_html/node9.html

Generally, the kernel source code is the last thing you should read, though
it also has the last word on fork() :-).
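
For the user-space side of fork and exec, here is a minimal sketch (the
helper name spawn_and_wait() is made up): the child is created with fork(),
replaces its own image with execv(), and the parent collects the exit status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `path` with `argv` in a child process and
 * return its exit status, or -1 if fork() or waitpid() failed. */
int spawn_and_wait(const char *path, char *const argv[])
{
    int status = 0;
    pid_t pid = fork();          /* new task; pages shared copy-on-write */

    if (pid < 0)
        return -1;

    if (pid == 0) {
        execv(path, argv);       /* replaces the child's address space */
        _exit(127);              /* reached only if execv() failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing "/bin/sh" with the arguments { "sh", "-c", "exit 3",
NULL } should return 3, and a path that does not exist should return 127.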

On Fri, Jan 18, 2013 at 1:59 AM, Mulyadi Santosa
wrote:

> Hi :)
>
> On Fri, Jan 18, 2013 at 12:02 AM, Niroj Pokhrel 
> wrote:
> > Hi all,
> > I have been using fork and exec for sometime. But I have no idea about
> what
> > are the things done by the kernel when we fork or exec and how things
> work.
> > How the kernel load new program and what all things are done ... Can
> > anybody please explain me this ? Thank you in advance.
>
> this is too broad to answer, but in general fork() does:
> - preparing new address space
> - preparing new task_struct
> - doing COW (copy on write), so newly born child initially simply use
> parent's pages
>
> in exec() case, instead of COW, you load the target binary. It does so
> through the ELF loader in kernel space and, for dynamically linked
> programs, the ELF interpreter (the dynamic linker) in user space.
>
> --
> regards,
>
> Mulyadi Santosa
> Freelance Linux trainer and consultant
>
> blog: the-hydra.blogspot.com
> training: mulyaditraining.blogspot.com
>
>



-- 
Regards,
Peter Teoh

Re: what is the function of tcp_prequeue ?

2013-01-17 Thread Peter Teoh

On Wed, Jan 16, 2013 at 2:50 PM, horseriver  wrote:

> hi:
>
>   what is the function of tcp_prequeue ?
>
>
Basically there are 3 types of TCP queuing: the receive queue, the prequeue
and the backlog queue (use Google Translate if you don't read Chinese):

http://www.360doc.com/content/09/0518/15/36491_3551831.shtml

See here for a detailed overview:

http://e-university.wisdomjobs.com/linux/chapter-189-277/sending-the-data-from-the-socket-through-udp-and-tcp.html



> thanks!
>
>



-- 
Regards,
Peter Teoh

Re: what is the difference between poll and epoll ?

2013-01-14 Thread Peter Teoh

It is used to wait on a particular event based on the file descriptor.

http://stackoverflow.com/questions/9167752/how-does-the-poll-function-work-in-c

http://linux.die.net/man/3/poll

Look at the example if you don't understand the man page:

http://linux.byexamples.com/archives/133/write-a-function/

and this comes with explanation + example:

http://www.linux-mag.com/id/357/
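
A small user-space sketch of waiting on a file descriptor with poll(2). The
helper names are made up for illustration. On the kernel side, it is the
poll method of the file behind the descriptor that tells poll() whether the
descriptor is ready.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;         /* the event we are interested in */
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;                /* timed out, nothing to read */
    return (pfd.revents & POLLIN) ? 1 : -1;
}

/* Read from fd only once poll() reports data. Returns the byte count
 * from read(), 0 on timeout, -1 on error. */
ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
    int rc = wait_readable(fd, timeout_ms);

    if (rc <= 0)
        return rc;
    return read(fd, buf, len);
}
```

With a pipe, wait_readable() on the read end returns 0 until something is
written to the write end, and 1 afterwards.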


On Mon, Jan 14, 2013 at 4:20 AM, horseriver  wrote:

> hi:
>
>   what is the function of a file's poll function ?
>
>
> thanks!
>



-- 
Regards,
Peter Teoh

Re: to implement cntl+c from shell script on minicom

2013-01-12 Thread Peter Teoh

Sorry I have not done any runscript programming before, but reading this:


http://lists.alioth.debian.org/pipermail/minicom-devel/2008/000904.html

(search for the "^C")

and referring to the documentation:

http://linux.die.net/man/1/runscript

and assuming the above example script runs well, it seems that the
difference is that your script has a backslash before the control
character?

Just my guess


On Sat, Jan 12, 2013 at 8:04 PM, laliteshwar yadav wrote:

> Hi Tushar,
> I am facing a problem with the above implementation.
>
> When target is powered on, logs coming started on minicom.
> After 1 second the message is coming as "Executing boot script in 3.000
> seconds - enter ^C to abort".
>
> Here, we need to give cntl+c command to stop the target from auto-boot. We
> want to flash new image into it.
>
> Our task is to automate the process.
>
> I tried with the following code to run through runscript.
>
> set search_string="Executing boot script in 3.000 seconds"
>
> timeout 50
> verbose on
>
> send "\n\r\n\r"
> expect {
> "$search_string" break
> timeout 2 goto abort
> }
>
> abort:
> print \nGiving cntl+c command on minicom
> send "\^C\r"
> send "\^C"
> send \^C\r
> send ^C\r
> expect {
> "RedBoot>" break
> timeout 3 goto panic
> }
> print \n!!Bye Bye runscript!!!
> sleep 2
>
> panic:
> print \n!!Bye Bye Minicom!!!\n
> !  killall -2 minicom
>
>
> What i am observing  is , first this script is running then minicom logs
> start comming. Actually it should be as first some logs should come till
> the search string. Then our command cntl+c should run.
>
> please help me..
>
> Thank you in advance..
>
>
> Regards,
> lalit
>
>
>


-- 
Regards,
Peter Teoh

Re: What is asmlinkage ?

2013-01-11 Thread Peter Teoh

It is defined in include/linux/linkage.h.   And more info here:

http://pix.cs.olemiss.edu/csci523/kernelidioms

Part of it is quoted below. Ultimately it comes down to a GCC feature
("__attribute__((regparm(0)))"):

The asmlinkage Macro

On i386 the asmlinkage macro is defined as:
#define asmlinkage CPP_ASMLINKAGE __attribute__((regparm(0)))
where CPP_ASMLINKAGE is defined as:
#define CPP_ASMLINKAGE extern "C"
when compiling as C++, and as nothing otherwise.

This is used in the system call interface where C library routines enter
the kernel after setting up their arguments and executing the trap instruction
(INT 0x80) to enter the kernel. The "asmlinkage" tag really should read "C
language linkage."

GCC takes an i386-specific __attribute__((regparm(0))) that causes
the compiler to pass integer arguments on the stack instead of
in registers.  Functions that take a variable number of arguments will
continue to be passed all of their arguments on the stack.
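
To make the attribute concrete outside the kernel, here is a small sketch.
The macro name stack_args is made up and is not a kernel macro. It only
expands to regparm(0) on 32-bit x86 with GCC, because the attribute does not
apply on other targets.

```c
/* Illustration only: `stack_args` is a made-up name, not a kernel macro.
 * On 32-bit x86, regparm(0) asks GCC to pass every argument on the stack.
 * On other targets the macro expands to nothing. */
#if defined(__i386__) && defined(__GNUC__)
#define stack_args __attribute__((regparm(0)))
#else
#define stack_args
#endif

/* A function that an assembly stub could call after pushing its
 * arguments on the stack, which is the situation asmlinkage exists for. */
stack_args long add3(long a, long b, long c)
{
    return a + b + c;
}
```

The C-level behaviour is unchanged (add3(1, 2, 3) is still 6); only the
calling convention seen by assembly callers differs.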

On Fri, Jan 11, 2013 at 2:56 PM, Rajat Sharma  wrote:

>
> > it is defined even in  much earlier release:
> http://lxr.free-electrons.com/ident?v=2.6.32;i=asmlinkage
>
> There seems to be no definition for arm here too. I literally meant
> definition as '#define asmlinkage' not the usage of it. For arm it is none
> so default defined in include/linux/linkage.h is used which is nothing
> special and just extern 'C' declaration to avoid garbled naming of C++
> linkage, thats it.
>
> -Rajat
>
>
> On Fri, Jan 11, 2013 at 12:00 PM, Peter Teoh wrote:
>
>>
>>
>> On Fri, Jan 11, 2013 at 1:35 PM, Rajat Sharma  wrote:
>>
>>> > asmlinkage is defined for almost all arch:
>>> > grep asmlinkage arch/arm/*/* and u got the answer.
>>>
>>> I didn't see a definition of macro atleast in linux source I was
>>> browsing (3.2.0), Could you please point out to any one you have found.
>>>
>>
>> it is defined even in  much earlier release:
>>
>> http://lxr.free-electrons.com/ident?v=2.6.32;i=asmlinkage
>>
>> for example, and every arch possible has a use of it.
>>
>> --
>> Regards,
>> Peter Teoh
>
>
>

-- 
Regards,
Peter Teoh

Re: why are scheduling domains used in multiprocessor systems

2013-01-10 Thread Peter Teoh

On Thu, Jan 10, 2013 at 6:09 PM, Bond  wrote:

> On Thu, Jan 10, 2013 at 9:00 AM, Preeti U Murthy
>  wrote:
> > d1's 'groups',both the sd0s.Here is
> > the next advantage.It needs information about the sched group alone and
> > will not bother about the individual cpus in it.it checks if
> > load(sd0[cpu2,cpu3]) > load(sd0[cpu0,cpu1])
> > Only if this is true does it go on to see if cpu2/3 is more loaded.If
> > there were no scheduler domain or groups,we would have to see the states
> > of cpu2 and cpu3 in two iterations instead of 1 iteration like we are
> > doing now.
>
> Thanks Peter and preeti, I had seen that intel link and had read but
> was not very clear with it,
> with both explanations and new links I am clear.
>

Sorry, I am still learning all of this.   On top of scheduling domains, there
are also cpusets, and the two are intertwined (for example, look into
sched_fair.c), and a CPU inside a cpuset can be taken offline/online, or be
allowed/disallowed for use.

I do not know the conceptual difference between cpusets and sched_domain.
Any guidance?

-- 
Regards,
Peter Teoh

Re: What is asmlinkage ?

2013-01-10 Thread Peter Teoh

On Fri, Jan 11, 2013 at 1:35 PM, Rajat Sharma  wrote:

> > asmlinkage is defined for almost all arch:
> > grep asmlinkage arch/arm/*/* and u got the answer.
>
> I didn't see a definition of macro atleast in linux source I was browsing
> (3.2.0), Could you please point out to any one you have found.
>

it is defined even in  much earlier release:

http://lxr.free-electrons.com/ident?v=2.6.32;i=asmlinkage

for example, and every arch possible has a use of it.

-- 
Regards,
Peter Teoh

Re: What is asmlinkage ?

2013-01-10 Thread Peter Teoh

A good example is system call they are passed using registers IIRC
> >>>>
> >>>>
> >>>> --
> >>>> regards,
> >>>>
> >>>> Mulyadi Santosa
> >>>> Freelance Linux trainer and consultant
> >>>>
> >>>> blog: the-hydra.blogspot.com
> >>>> training: mulyaditraining.blogspot.com



-- 
Regards,
Peter Teoh

Re: what is the difference between poll and epoll ?

2013-01-10 Thread Peter Teoh

Read this (classic answer):

http://stackoverflow.com/questions/4093185/whats-the-difference-between-epoll-poll-threadpool

and from below:

http://stackoverflow.com/questions/4039832/select-vs-poll-vs-epoll

Which will bring you to:

http://daniel.haxx.se/docs/poll-vs-select.html
and
http://www.kegel.com/c10k.html

and the three are explained in depth here:

http://www.makelinux.net/ldd3/chp-6-sect-3

The differences are also explained here:

http://www.winddisk.com/2012/03/28/epoll%E4%B8%8Eselectpoll%E7%9A%84%E5%8C%BA%E5%88%AB/

and it comes with a pictorial diagram, as you requested.
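
For comparison with poll(), a minimal epoll sketch, assuming Linux. The
helper name is made up. The descriptor is registered once with epoll_ctl(),
and epoll_wait() then reports only the descriptors that are ready.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: register fd for EPOLLIN on a fresh epoll instance
 * and wait up to timeout_ms. Returns the number of ready descriptors
 * (0 or 1 here), or -1 on error. */
int epoll_wait_readable(int fd, int timeout_ms)
{
    struct epoll_event ev = {0};
    struct epoll_event ready;
    int epfd, n;

    epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    ev.events = EPOLLIN;         /* interested in "readable" */
    ev.data.fd = fd;             /* handed back to us by epoll_wait() */
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(epfd);
        return -1;
    }

    n = epoll_wait(epfd, &ready, 1, timeout_ms);
    close(epfd);
    return n;
}
```

A real server would keep the epoll instance open and call epoll_wait() in a
loop; creating it for each call here just keeps the sketch self-contained.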

On Tue, Jan 8, 2013 at 4:35 AM, horseriver  wrote:

> hi:
>
>I know epoll is event triger model ,but I do not know internel
>
>surpport for it .
>
>is there some illustration for  epoll's frame or internel
> implementation?
>
> thanks!
>
> ___
> Kernelnewbies mailing list
> Kernelnewbies@kernelnewbies.org
> http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
>



-- 
Regards,
Peter Teoh

Re: why are scheduling domains used in multiprocessor systems

2013-01-09 Thread Peter Teoh

On Wed, Jan 9, 2013 at 4:03 PM, Bond  wrote:

> Hi,
> please see this question
>
> http://stackoverflow.com/questions/14229793/what-does-struct-sched-domain-stands-for-in-include-linux-sched-h-scheduling-do
>
> I checked following
> http://lwn.net/Articles/169277/ and following
> http://www.kernel.org/doc/Documentation/scheduler/sched-domains.txt
> the first line of kernel.org doc says
> .  Each CPU has a "base" scheduling domain (struct
> sched_domain)..
> and second para says
> " each scheduling domain spans a number of CPUs (stored in the ->span
> field)."
> third para says
> "  Each scheduling domain must have one or more CPU
> groups..
> The intersection of cpumasks from any two of these groups
> MUST be the empty set."
> then some where in doc it says
> "Balancing within a sched domain occurs between groups. That is, each group
> is treated as one entity." the doc in details talks about the
> implementation of
>
> scheduling domains and mentions that CPUs should belong to one of the
> scheduling domain in a way that
> cpumasks intersection should  be an  empty set
>
> The answer of the question that I want to know is
> why is a scheduling domain actually needed?
>
> _
>
> CPU scheduling involves many configurations and factors.

https://www.cs.unm.edu/~eschulte/classes/cs587/data/10.1.1.59.6385.pdf

Go to page 18 for the definition of a scheduler domain, where it says:

"Each node in a system has a scheduler domain that points to its parent
scheduler domain. A node might be
a uniprocessor system, an SMP system, or a node within a NUMA system."

This complex hierarchy of CPUs normally reflects the physical proximity of
the CPUs in hardware (just one factor) or the speed of the bus that
connects them.   Not every CPU is connected to every other CPU, but perhaps
only to two or four others, so this proximity information has to be built
into the kernel to minimize the cost of transferring data between CPUs.

90% (or more) of supercomputers (with thousands of CPUs) run the Linux
kernel, and clearly each CPU can only have a few neighboring CPUs.   Another
factor is power management: when processing usage goes down, you have to
shut down CPUs, leaving only the bare minimum running.   Organizing the
CPUs into a hierarchy facilitates this in the scheduling algorithm.
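
The sched-domains document quoted above says that balancing happens between
groups, with each group treated as one entity. A toy user-space model of
that idea is below. The types and names are invented for illustration and
are not the kernel's struct sched_domain or struct sched_group, and loads
are assumed to be non-negative.

```c
#include <stddef.h>

/* Toy model only: invented types, not the kernel's data structures. */
struct toy_group {
    const int *cpus;     /* CPU ids in this group */
    size_t ncpus;
};

/* Sum of the loads of all CPUs in one group. */
static long group_load(const struct toy_group *g, const long *cpu_load)
{
    long sum = 0;
    size_t i;

    for (i = 0; i < g->ncpus; i++)
        sum += cpu_load[g->cpus[i]];
    return sum;
}

/* Compare groups as whole units first, then look at individual CPUs
 * only inside the busiest group. Returns the CPU id, or -1 if empty. */
int toy_find_busiest_cpu(const struct toy_group *groups, size_t ngroups,
                         const long *cpu_load)
{
    const struct toy_group *busiest = NULL;
    long best = -1;
    int cpu = -1;
    size_t i;

    for (i = 0; i < ngroups; i++) {
        long load = group_load(&groups[i], cpu_load);

        if (load > best) {
            best = load;
            busiest = &groups[i];
        }
    }
    if (busiest == NULL)
        return -1;

    best = -1;
    for (i = 0; i < busiest->ncpus; i++) {
        long load = cpu_load[busiest->cpus[i]];

        if (load > best) {
            best = load;
            cpu = busiest->cpus[i];
        }
    }
    return cpu;
}
```

With groups {0,1} and {2,3} and loads 9, 1, 6, 7, the second group is busier
as a whole (13 against 10), so CPU 3 is picked even though CPU 0 has the
highest individual load.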

http://www.intel.com/technology/itj/2007/v11i4/9-process/6-linux-scheduler.htm
http://www.cs.stonybrook.edu/~porter/courses/cse506/f12/slides/scheduling.pdf
http://www.cs.stonybrook.edu/~porter/courses/cse506/f12/slides/scheduling2.pdf

-- 
Regards,
Peter Teoh

Linux Kernel Map

2013-01-09 Thread Peter Teoh

http://www.makelinux.net/kernel_map/

-- 
Regards,
Peter Teoh

Re: /usr/ld Not enough room for program headers

2013-01-09 Thread Peter Teoh

On Wed, Jan 9, 2013 at 6:36 AM, horseriver  wrote:

> On Wed, Jan 09, 2013 at 01:28:12PM +0800, Peter Teoh wrote:
> > On Sun, Jan 6, 2013 at 11:17 AM, horseriver 
> wrote:
> >
> VSYSCALL_BASE = 0xe000;
>
> SECTIONS
> {
>   . = VSYSCALL_BASE ;
>
>   .hash   : { *(.hash) }:text
>   .dynsym : { *(.dynsym) }
>   .dynstr : { *(.dynstr) }
>   .gnu.version: { *(.gnu.version) }
>   .gnu.version_d  : { *(.gnu.version_d) }
>   .gnu.version_r  : { *(.gnu.version_r) }
>


I suspect something is wrong with the VSYSCALL_BASE value here.

Look at this:

http://marcbug.scc-dc.com/svn/repository/trunk/linuxkernel/linux-2.6.16-mcemu/arch/x86_64/ia32/vsyscall.lds

Doing a diff against your ld script, there is not much difference, except
for the VSYSCALL_BASE + SIZEOF_HEADERS

portion.

Read here to understand how SIZEOF_HEADERS is calculated:

http://www.math.utah.edu/docs/info/ld_3.html#SEC13

I am not sure why one would want to shift the whole section down by
SIZEOF_HEADERS bytes?

-- 
Regards,
Peter Teoh

Re: /usr/ld Not enough room for program headers

2013-01-08 Thread Peter Teoh

On Sun, Jan 6, 2013 at 11:17 AM, horseriver  wrote:

> On Fri, Jan 04, 2013 at 11:34:24AM +0400, Игорь Пашев wrote:
> > 2013/1/4 horseriver 
> > >
> > > Not enough room for program headers
> >
> >
>
>
> > Try to search the Web for this. E. g.:
> > http://lists.gnu.org/archive/html/bug-gnu-utils/2002-08/msg00176.html
>
> thanks!
>
> in my compile option. I have specifiedmy ld-script file ,and there is no
> SIZEOF_HEADER in that file ,
>
>
Can you show us your ld script?   According to the msg00176.html above, the
error can arise because you have placed another section (e.g. .text)
wrongly, and it may have nothing to do with SIZEOF_HEADERS.


> but where this error come from ?
>
>
>
>
>



-- 
Regards,
Peter Teoh

Re: calling system call in arm from user space

2013-01-08 Thread Peter Teoh

On Wed, Dec 26, 2012 at 1:04 PM, Niroj Pokhrel wrote:

> Hi,
> I have written a system call and build it with kernel for Arm
> architecture. However, I'm confused to use it to call it from the user
> space. As it is in x86, where we can simply call by using sycall() function
> and the return value is returned by the syscal() itself.
> In Arm, I tried to write an assembly language program and was able to call
> the system call using the assembly code but what I'm


Care to show us how you called the system call from assembly on ARM?


> confused is how to call this function using C program. I tried using
> inline assembly but it didn't work. Further, if I can implement it using
> inline assembly then return value will be in r0 and how can I move this
> value to the user variable.
> Thanking you in advance.
>
>
arch/arm/kernel/entry-common.S (and arch/arm/kernel/calls.S) work together
to implement the pre-syscall and post-syscall wrappers you asked about.
Perhaps you can try to understand that code first?
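
From C you normally do not need inline assembly at all: libc's syscall()
wrapper exists on ARM as well and returns the kernel's result as its return
value. A minimal sketch, where SYS_getpid stands in for the number of your
own system call (which I do not know):

```c
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke a system call by number through libc's syscall() wrapper.
 * SYS_getpid is only a stand-in for the number of a custom system call.
 * The kernel's result comes back as the return value; on failure the
 * wrapper returns -1 and sets errno. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

The value the kernel leaves in the result register ends up in an ordinary C
variable, e.g. "long ret = raw_getpid();".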


> --
> Niroj Pokhrel
> Software Engineer,
> Samsung India Software Operations
>
>
>


-- 
Regards,
Peter Teoh

Re: Locating the keyboard driver (and replacing it)

2013-01-08 Thread Peter Teoh

This article gives very in-depth coverage of keyboard processing in
Linux:

http://www.phrack.com/issues.html?issue=59&id=14&mode=txt

http://www.gadgetweb.de/programming/39-how-to-building-your-own-kernel-space-keylogger.html

Not sure about your architecture, but on my Lenovo laptop, when I do a
"cat /dev/input/by-path/platform-i8042-serio-0-event-kbd" and redirect it to
a file, every single key I press is captured in the file.
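
What comes out of that device node is a stream of struct input_event
records (from <linux/input.h>), not plain characters. A minimal sketch of
decoding them, with a made-up helper name:

```c
#include <linux/input.h>
#include <unistd.h>

/* Hypothetical helper: read struct input_event records from fd until EOF
 * or a short read, and count key-press events (type EV_KEY, value 1).
 * Value 0 is a release and value 2 is an auto-repeat. */
long count_key_presses(int fd)
{
    struct input_event ev;
    long presses = 0;

    while (read(fd, &ev, sizeof ev) == (ssize_t)sizeof ev) {
        if (ev.type == EV_KEY && ev.value == 1)
            presses++;
    }
    return presses;
}
```

On a real event device the read() blocks until the next key event, so this
loop only ends on an error or when the descriptor is closed. Reading the
device normally needs root.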

Therefore, looking into the kernel source, we can infer that
drivers/input/serio/i8042.c is responsible for the keyboard processing.
Of course, this file is compiled into the kernel, not built as a kernel
module. So if you want to make any changes without recompiling the kernel
and rebooting, one way to do it dynamically is called "inline hooking" (look
elsewhere for details of this method).   It is explained in the following
article:

http://www.phrack.com/issues.html?issue=59&id=14&mode=txt

But note the difference between Phrack's interception and intercepting
the API inside i8042.c:   when you do a
"cat  /dev/input/by-path/platform-i8042-serio-0-event-kbd" the keyboard
input is always captured, regardless of which window/terminal you are in.
Phrack's method is cleaner: it intercepts at the tty level (e.g.
drivers/tty/n_tty.c:receive_buf() in the kernel source), so if you switch
over to another window, the input is switched away too. It is thus
targeted at only that TTY.

And btw, the USB keyboard processing path is altogether different again:

http://www.lrr.in.tum.de/Par/arch/usb/download/usbdoc/usbdoc-1.32.pdf

and perhaps u can read here many good writeups:

http://stackoverflow.com/search?q=usb+keyboard+kernel

On Fri, Dec 14, 2012 at 3:46 PM, manty kuma  wrote:

> Hi,
>
> I have written a small module that toggles the capslock LED. To
> demonstrate it i want to replace the Existing keyboard module with mine. I
> tried lsmod|grep "key" without any success. also checked /proc/modules. I
> couldnot find any clue regarding the name of the module i need to
> uninstall. So, How can i remove the existing keyboard module and insert
> mine?
>
> Regards,
> Manty
>
>
>
>
>

-- 
Regards,
Peter Teoh

Re: keyboard driver question

2013-01-05 Thread Peter Teoh

If your keyboard is not USB-based, then perhaps articles like these are
your answer:

http://www.computer-engineering.org/ps2keyboard/
http://eduunix.ccut.edu.cn/index2/html/linux/Sybex%20Linux%20Power%20Tools%202003/6222final/LiB0023.html
http://freeworld.thc.org/papers/writing-linux-kernel-keylogger.txt

Since your KB is USB-based, you can look here for info on the internals:

http://www.emntech.com/docs/USB_KeyBoard_Driver_eMNTech.pdf

Inside there is a picture on the overall flow.

Essentially, the key part is the usb_kbd_probe() function.   Your problem of
linking/delinking the KB may also be answered by:

http://unix.stackexchange.com/questions/12005/how-to-use-linux-kernel-driver-bind-unbind-interface-for-usb-hid-devices

Another good ref is:

http://www.linux.it/~rubini/docs/usb/usb.html

as it simplifies the complex flow of USB processing in the kernel, for the
HID part in particular.

A good analogy to your problem is the apple keyboard:

http://www.cyberciti.biz/faq/linux-apple-usb-keyboard-driver-installation/

and looking into the implementation in drivers/hid/hid-apple.c (kernel
source) can perhaps give you some insight.

Another thing is the non-kernel processing of scancode:


http://eduunix.ccut.edu.cn/index2/html/linux/Sybex%20Linux%20Power%20Tools%202003/6222final/LiB0023.html

As described there, the X Window keymap may also be used to change the
mapping.

http://www.in-ulm.de/~mascheck/X11/xmodmap.html

http://bochs.sourceforge.net/doc/docbook/user/keymap.html

http://madduck.net/docs/extending-xkb/

http://www.pixelbeat.org/docs/xkeyboard/



On Fri, Jan 4, 2013 at 2:17 AM, Racz Zoli  wrote:

> Hi.
>
> I`m sorry if this isn`t the right place to post my question, but first I
> tried posting it on forum.kernelnewbies.org and nobody answered. Here`s
> my question:
>
>
> I have a Gembird kb-9140l keyboard with some multimedia keys which are not
> working on linux. I thought about writing my own driver for it, so as a
> start, I wrote a small module, which registers an interrupt handler on irq
> 1 with the IRQF_SHARED flag. In the handler function I put a simple printk
> with the scancode read from the keyboard. The problem is, that the handler
> never gets executed. I searched on google, and found that because the
> native driver doesn`t share its interrupt with another modules, before I
> call request_irq I have to free the original interrupt handler from the
> native driver. This would make my computer practically unusable until I
> reboot, but at least I would see, it works, but it doesn`t. The original
> driver works fine after I insert my module, and the interrupt handler still
> doesn`t get called. The weird thing is, when I remove my module, my handler
> executes once, and the scancode is 0xFE.
>
> The code is the following:
>
> #include <linux/module.h>
> #include <linux/init.h>
> #include <linux/kernel.h>
> #include <linux/interrupt.h>
> #include <asm/io.h>
>
>
> MODULE_LICENSE("Dual BSD/GPL");
>
> static int gembirdkb_init(void);
> static void gembirdkb_exit(void);
>
>
> irq_handler_t irq_handler (int irq, void *dev_id, struct pt_regs *regs)
> {
> static unsigned char scancode;
>
> scancode = inb (0x60);
>
> printk("gembirdkb: irq handled... scancode: %d\n",scancode);
>
> return (irq_handler_t) IRQ_HANDLED;
> }
>
>
> static int gembirdkb_init(void)
> {
> int ret;
>
> /* free original interrupt handler */
> // free_irq(1, NULL);
>
> ret = request_irq (1, (irq_handler_t) irq_handler, IRQF_SHARED,
> "gembirdkb", (void *)&irq_handler);
>
> printk("gembirdkb: request_irq result: %d\n", ret);
>
> return ret;
> }
>
> static void gembirdkb_exit(void)
> {
> free_irq(1, (void *)&irq_handler);
> }
>
>
> module_init(gembirdkb_init);
> module_exit(gembirdkb_exit);
>
> Is there any way I can remove the native driver, or I need to recompile
> the kernel without it, and insert mine?
>
> P.s.: Why every topic on the forum is full with questions about mac,
> iphone, samsung galaxy etc.?
>
>
>


-- 
Regards,
Peter Teoh

Re: trace MKFS

2013-01-04 Thread Peter Teoh

When you execute "mkfs", one of the following command-line utilities is
executed, depending on the "-t" filesystem type passed to mkfs:

mkfs.cramfs  mkfs.ext4  mkfs.minix    mkfs.reiserfs
mkfs.bfs     mkfs.ext2  mkfs.ext4dev  mkfs.msdos     mkfs.vfat
mkfs.btrfs   mkfs.ext3  mkfs.jfs      mkfs.ntfs      mkfs.xfs
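
As a rough sketch of that dispatch (the helper name is made up, and the
real mkfs source is the authority on how it actually does this), the front
end only needs to build the name "mkfs.<type>" and hand over to that program:

```c
#include <stdio.h>

/* Hypothetical helper: build the name of the per-filesystem program for
 * a given "-t" type, e.g. "ext4" -> "mkfs.ext4".
 * Returns 0 on success, -1 if the name does not fit in buf. */
int mkfs_helper_name(const char *fstype, char *buf, size_t len)
{
    int n = snprintf(buf, len, "mkfs.%s", fstype);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The resulting name could then be passed to something like execvp() together
with the remaining command-line arguments.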

Each of the above commands comes from a filesystem utility package.   Look
into the source for a good understanding.   For ext2/ext3, the package is
called e2fsprogs, so on Ubuntu (or another Debian-based distro) you do an
"apt-get source e2fsprogs" to get the source.

Reading the source of the mkfs.ext2 (mke2fs) main() function:

http://pastebin.com/xcsB6GUC

you can see that after lots of code setting up structures in memory, it
starts writing the inode tables etc.:

write_inode_tables(fs, lazy_itable_init, itable_zeroed);
create_root_dir(fs);
create_lost_and_found(fs);
reserve_inodes(fs);
create_bad_block_inode(fs, bb_list);

Following through the source code is much more understandable than going
through output of "strace", which records all the interface with the kernel.

Follow the slides here:

http://www.geego.com/free-linux-lpic-training-material-study-guide/lpic1-modules/4-5/ext2-ext3.html

and a few slides further on you will see that mkfs is just writing the
header structures on the hard disk that contain the definition of the FS.
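
One of those header structures is easy to check by hand. On ext2/ext3/ext4
the superblock starts 1024 bytes into the device, and its s_magic field (at
offset 56 inside the superblock) holds 0xEF53 in little-endian order. A
minimal sketch that looks for it in a raw image buffer, with a made-up
helper name:

```c
#include <stddef.h>
#include <stdint.h>

#define EXT2_SUPERBLOCK_OFFSET 1024u   /* superblock starts 1 KiB in */
#define EXT2_MAGIC_OFFSET      56u     /* s_magic inside the superblock */
#define EXT2_MAGIC             0xEF53u

/* Return 1 if the image carries the ext2/3/4 magic number, else 0. */
int has_ext_magic(const unsigned char *img, size_t len)
{
    size_t off = EXT2_SUPERBLOCK_OFFSET + EXT2_MAGIC_OFFSET;
    uint16_t magic;

    if (len < off + 2)
        return 0;                /* image too short to hold the field */

    /* s_magic is stored little-endian on disk. */
    magic = (uint16_t)(img[off] | (img[off + 1] << 8));
    return magic == EXT2_MAGIC;
}
```

Feeding it the first 2 KiB of a freshly formatted ext partition should give
1, and the same region before running mkfs should normally give 0.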

Similarly you can find many university courses on filesystem internals, e.g.:

http://scx010c06a.blogspot.sg/2012/03/second-extended-file-system-ext2.html

Generally, real-life analysis of the hard disk/filesystem is done in
forensics, so if you google for fs forensics you can find lots of tools that
walk the hard disk for the different components:

http://www.dfrws.org/2007/proceedings/p55-barik.pdf

http://www.cs.kau.se/~stefan/forensics/chapter14-15.pdf

http://www.blackhat.com/presentations/bh-asia-03/bh-asia-03-grugq/bh-asia-03-grugq.pdf

http://www.dfrws.org/2007/proceedings/p55-barik_pres.pdf

and this is forensics of ext4 filesystem:

http://www.dfrws.org/2012/proceedings/DFRWS2012-13.pdf

Understanding "mkfs" is really as good as understanding FS internals.

On Fri, Jan 4, 2013 at 11:12 PM, KASHISH BHATIA <
kashish.bhatia1...@gmail.com> wrote:

> Hi,
>
> I want to trace the overall flow of mkfs inside linux kernel. Specifically
> want to know which
> kernel fs data structures are affected when we run "mkfs" ?
> What all "mkfs" command writes on the block device when we run the
> command?
> Are there any good documents which can explain the same?
>
> --
>
> Regards,
> Kashish Bhatia
>
>
>


-- 
Regards,
Peter Teoh

Re: Why "lsusb" return nothing?

2012-10-03 Thread Peter Teoh

Thank you for your help.   This is the result:

mount -t usbdevfs none /proc/bus/usb
mount: mount point /proc/bus/usb does not exist

mkdir /proc/bus/usb
mkdir: cannot create directory `/proc/bus/usb': No such file or directory

And suppose I try a directory that exists:

mount -t usbdevfs none /proc/bus
mount: unknown filesystem type 'usbdevfs'

The exact mirror (I mirrored the system before the problem started) is still
working today, and I have not found any difference between the two versions
so far.
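
The "unknown filesystem type" error can be cross-checked against the list of filesystems the running kernel has registered, which is exposed in /proc/filesystems. Below is a minimal Python sketch, assuming the usual format of an optional "nodev" flag followed by the filesystem name on each line. The function names are made up for illustration. As far as I know, usbdevfs was renamed usbfs in later kernels, so both names are checked.

```python
def registered_filesystems(text: str) -> set:
    """Return the filesystem names listed in /proc/filesystems-style text."""
    names = set()
    for line in text.splitlines():
        fields = line.split()
        if fields:
            # The name is the last field; "nodev", when present, comes first.
            names.add(fields[-1])
    return names


def usb_filesystem_support(path: str = "/proc/filesystems") -> dict:
    """Report whether the kernel knows 'usbfs' or the older 'usbdevfs' name."""
    with open(path) as f:
        known = registered_filesystems(f.read())
    return {name: name in known for name in ("usbfs", "usbdevfs")}
```

If neither name shows up, "mount -t usbdevfs" cannot succeed no matter which mount point is used.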

On Mon, Oct 1, 2012 at 12:51 PM,  wrote:

> These steps helped me when I had the same problem in SUSE9.
>
> The reason is that /proc/bus/usb/ doesn't have any entries, and that is
> where lsusb actually searches to show USB bus devices. To make that happen
> you have to manually mount the bus devices using the command below.
>
> mount -t usbdevfs none /proc/bus/usb/
>
> And you are done.
> Now lsusb should show all USB bus devices.
>
> Thanks
> Ashish Bunkar
>
> -Original Message-
> From: kernelnewbies-boun...@kernelnewbies.org [mailto:
> kernelnewbies-boun...@kernelnewbies.org] On Behalf Of Peter Teoh
> Sent: Saturday, September 29, 2012 7:12 AM
> To: kernelnewbies@kernelnewbies.org
> Subject: Why "lsusb" return nothing?
>
> I entered "lsusb" at the command line (as root) and nothing is returned, not
> even an error message.
>
> Doing a strace the last few lines are:
>
> open("/dev/bus/usb", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1
> ENOENT (No such file or directory)
> open("/proc/bus/usb", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1
> ENOENT (No such file or directory)
>
> What happened?
>
> This is Ubuntu 10.04 (it used NOT to be like that, not sure what I did
> wrong last time).   But running a VirtualBox INSIDE this same OS, I
> was able to get a result from "lsusb" (after enabling the USB devices in
> the VirtualBox interface) and strace gives this result:
>
> open("/dev/bus/usb/001/002", O_RDWR)= 3
> ioctl(3, USBDEVFS_IOCTL, 0xbff6f75c)= -1 ENOTTY (Inappropriate
> ioctl for device)
> close(3)= 0
> open("/dev/bus/usb/001/001", O_RDWR)= 3
>
> Why the difference?
>
> --
> Regards,
> Peter Teoh



-- 
Regards,
Peter Teoh

Re: Facing trouble in creating a packet in kernel space

2012-10-01 Thread Peter Teoh

Yes, Michael has a point: a proxy is easier than doing it in the kernel. I
used Webscarab for this. Alternatively, another tool I have used is scapy
(no proxy setup is needed), and I must say it is a FANTASTIC tool for
this purpose. First you capture with wireshark, and then replay via
scapy, which has a function called "fuzz()" for this purpose.

http://www.secdev.org/conf/scapy_pacsec05.handout.pdf

http://media.packetlife.net/media/library/36/scapy.pdf

http://theitgeekchronicles.files.wordpress.com/2012/05/scapyguide1.pdf

It is low-level enough for you to fuzz the different protocol layers inside
each packet.
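
Whichever userspace tool is used, appending bytes to a UDP payload means the UDP length and checksum have to be recomputed. The following is a minimal sketch using only the Python standard library (it is not scapy's API); the function names, addresses and ports are assumptions for illustration, and the "some message" + "12345" payload is borrowed from the question below.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum, padding odd-length input with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return total


def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    """IPv4 pseudo-header covered by the UDP checksum (protocol 17 = UDP)."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_length))


def build_udp(src_ip, dst_ip, sport, dport, payload: bytes) -> bytes:
    """Build a UDP header plus payload with correct length and checksum."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", sport, dport, length, 0)
    csum = ~ones_complement_sum(pseudo_header(src_ip, dst_ip, length)
                                + header + payload) & 0xFFFF
    if csum == 0:
        csum = 0xFFFF  # an all-zero checksum field means "no checksum"
    return struct.pack("!HHHH", sport, dport, length, csum) + payload


def verify_udp(src_ip, dst_ip, datagram: bytes) -> bool:
    """A valid datagram sums to 0xFFFF when the checksum field is included."""
    pseudo = pseudo_header(src_ip, dst_ip, len(datagram))
    return ones_complement_sum(pseudo + datagram) == 0xFFFF
```

For example, build_udp("10.0.0.1", "10.0.0.2", 40000, 6000, b"some message" + b"12345") yields a datagram for which verify_udp() with the same addresses returns True, while changing any payload byte afterwards makes the check fail.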

On Wed, Sep 26, 2012 at 2:43 AM,
 wrote:
> Hi!
>
> On 16:28 Tue 25 Sep , Rifat Rahman wrote:
>> Hello there,
>>
>> I need to mangle rtp packets in kernel space. So far I am new in kernel
>> module programming. I am trying to implement a module for netfilter hooks.
>> For the first time as exercise, I am trying to write smaller modules. Let
>> me explain what I am actually doing now.
>>
>> I have an echo client and server. The server runs on port 6000. Both are on
>> different machines (May be VMs in bridge filter mode). The client sends udp
>> message and the server just echoes it back. Let us suppose the client sends
>> "some message" as data. Then now I am trying to write a module for the
>> client machine that will append "12345" after the data so that the server
>> will get "some message12345" and echo it back. Now there are various things
>> I faced. I relied on the NF_IP_POST_ROUTING hook.
>
> I do not understand why you try to do this in the kernel at all. Why does the
> client app not just send "some message12345" itself? If you want to mangle the
> data in transit, why not use a transparent proxy instead?
> http://stackoverflow.com/questions/5615579/how-to-get-original-destination-port-of-redirected-udp-message
>
>> At first, I copied the data to a temporary storage, and then add 12345 with
>> that. Then I increase skb->tail using skb_put(). Then I memset()  0 to the
>> packet data, and copy the temporary storage with that. Then as the
>> procedure, NF_ACCEPT is returned. There are certain checking points like
>> the udph->dest == 6000 etc. etc. When I use skb_put(), my system hangs out
>> after two or three minutes.
>
> What does it do exactly? If you do skb_put() and there is no space, you should
> get something like "skb_over_panic".
>
>> When I dmesg to be certain that everything goes
>> right, I find it OK. But, suppose once I send a message like "This is a
>> pretty big message" and another time I send "small message" then I get just
>> "small message12345g message" that means, the bigger message is stored
>> somewhere I don't understand. I tried with skb_add_data() but that works
>> incorrectly here, I understand it's my fault. I just can't figure it out.
>
> Could it be that the small message happened to allocate the same memory the
> previous packet used and thus has some unallocated data at the end?
>
>> Now, one thing came in my mind, if it's not possible, should I create new
>> packets for that data appending? I find skb->end - skb->tail is not so big.
>
> You might have to do so in some cases. But it might have some side effects
> nobody would think about. For example, take a look at this:
> http://lxr.linux.no/#linux+v3.5.4/net/sched/cls_cgroup.c#L117
> It essentially means that the packet queue layer2 accesses data all the way up
> to the socket layer. If you just copy the data, this will break. More things
> like this may exist.
>
> You might be also able to allocate a larger buffer and reuse the sk_buff. It
> might be less painful.
>
>> But ultimately I have to merge two or three packets into one packet and
>> then skb_put() will not suffice for me. Then the point comes, I can use
>> alloc_skb(), skb_reserve(), skb_header_pointer() and other skb manipulation
>> functions, but I don't understand how can I drop the packet got (should I
>> return NF_DROP?)
>
> There should be a way to drop packets inside netfilter rules (maybe not in
> postrouting though). I did not look into the code right now. Why not try
> returning NF_DROP and see if it leaks?
>
>> and how can I route my created packets in the packet flowing path?
>
> You could do it the dirty way and just call dev_queue_xmit(). The packet will
> be directly sent to the device without going through all the hooks (including
> yours) a second time. You have to be careful about the udp checksum and
> fragmentation.
> Also, if ipsec is in use, it will

Re: Why "lsusb" return nothing?

2012-09-29 Thread Peter Teoh

Furthermore, if you look at the /sys/bus/usb interface:

.
./uevent
./devices
./devices/usb1
./devices/1-0:1.0
./devices/usb2
./devices/2-0:1.0
./devices/1-1
./devices/1-1:1.0
./devices/2-1
./devices/2-1:1.0
./devices/1-1.1
./devices/1-1.1:1.0
./devices/1-1.2
./devices/1-1.2:1.0
./devices/1-1.3
./devices/1-1.3:1.0
./devices/1-1.4
./devices/1-1.4:1.0
./devices/1-1.5
./devices/1-1.5:1.0
./devices/1-1.5:1.1
./devices/2-1.2
./devices/2-1.2:1.0
./devices/2-1.3
./devices/2-1.3:1.0
./devices/2-1.6
./devices/2-1.6:1.0
./drivers
./drivers/usbfs
./drivers/usbfs/module
./drivers/usbfs/uevent
./drivers/usbfs/unbind
./drivers/usbfs/bind
./drivers/usbfs/new_id
./drivers/usbfs/remove_id
./drivers/usbfs/1-1.3:1.0
./drivers/hub
./drivers/hub/module
./drivers/hub/uevent
./drivers/hub/unbind
./drivers/hub/bind
./drivers/hub/new_id
./drivers/hub/remove_id
./drivers/hub/1-0:1.0
./drivers/hub/2-0:1.0
./drivers/hub/1-1:1.0
./drivers/hub/2-1:1.0
./drivers/usb
./drivers/usb/uevent
./drivers/usb/unbind
./drivers/usb/bind
./drivers/usb/usb1
./drivers/usb/usb2
./drivers/usb/1-1
./drivers/usb/2-1
./drivers/usb/1-1.1
./drivers/usb/1-1.2
./drivers/usb/1-1.3
./drivers/usb/1-1.4
./drivers/usb/1-1.5
./drivers/usb/2-1.2
./drivers/usb/2-1.3
./drivers/usb/2-1.6
./drivers/usb-storage
./drivers/usb-storage/1-1.1:1.0
./drivers/usb-storage/module
./drivers/usb-storage/uevent
./drivers/usb-storage/unbind
./drivers/usb-storage/bind
./drivers/usb-storage/remove_id
./drivers/usb-storage/2-1.2:1.0
./drivers/usbhid
./drivers/usbhid/1-1.2:1.0
./drivers/usbhid/module
./drivers/usbhid/uevent
./drivers/usbhid/unbind
./drivers/usbhid/bind
./drivers/usbhid/new_id
./drivers/usbhid/remove_id
./drivers/uvcvideo
./drivers/uvcvideo/1-1.5:1.0
./drivers/uvcvideo/1-1.5:1.1
./drivers/uvcvideo/module
./drivers/uvcvideo/uevent
./drivers/uvcvideo/unbind
./drivers/uvcvideo/bind
./drivers/uvcvideo/new_id
./drivers/uvcvideo/remove_id
./drivers_probe
./drivers_autoprobe

This, I guess, accounts for some of the hardware (like USB mass storage
devices) still working, whereas things that depend on the /dev/bus/usb
interface are not working (e.g. Android's adb).
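
To compare the two systems side by side, a small Python sketch like the one below can report which of the USB interfaces exist and what they contain. The function name is made up for illustration; the first two paths are the ones lsusb tried in the strace output quoted below, and the third is the sysfs tree listed above.

```python
import os

# /dev/bus/usb and /proc/bus/usb are what lsusb tried to open;
# /sys/bus/usb/devices is the sysfs view that still works here.
USB_INTERFACES = ("/dev/bus/usb", "/proc/bus/usb", "/sys/bus/usb/devices")


def probe_usb_interfaces(paths=USB_INTERFACES) -> dict:
    """Map each path to its sorted entries, or None if it is not a directory."""
    report = {}
    for path in paths:
        if not os.path.isdir(path):
            report[path] = None
            continue
        try:
            report[path] = sorted(os.listdir(path))
        except OSError:
            report[path] = []  # the directory exists but could not be read
    return report
```

On the broken system I would expect None for the first two paths and a populated list for the third; on the working mirror /dev/bus/usb should list bus directories such as 001.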

Another symptom: after I create the directory with "mkdir -p /dev/bus/usb",
inserting a new USB hard disk immediately deletes the directory.
I can still access the newly inserted hard disk, and dmesg returns:

[26949.222877] sd 12:0:0:0: [sdc] Mode Sense: 38 00 00 00
[26949.225095] sd 12:0:0:0: [sdc] No Caching mode page present
[26949.225099] sd 12:0:0:0: [sdc] Assuming drive cache: write through
[26949.230715] sd 12:0:0:0: [sdc] No Caching mode page present
[26949.230719] sd 12:0:0:0: [sdc] Assuming drive cache: write through
[26949.282965]  sdc: sdc1 sdc2 sdc4
[26949.288972] sd 12:0:0:0: [sdc] No Caching mode page present
[26949.288977] sd 12:0:0:0: [sdc] Assuming drive cache: write through
[26949.288980] sd 12:0:0:0: [sdc] Attached SCSI disk

Seemingly no problem, but "lsusb" returned NOTHING.

In fact I had also mirrored the entire system onto different
hardware before /dev/bus/usb became unavailable, and that
mirrored system is still working fine.

So I am quite sure it is a udev thing. I am just trying my luck: does
anyone know the answer?

On Sat, Sep 29, 2012 at 9:42 AM, Peter Teoh  wrote:
> I entered "lsusb" at the command line (as root) and nothing is returned,
> not even an error message.
>
> Doing a strace the last few lines are:
>
> open("/dev/bus/usb", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1
> ENOENT (No such file or directory)
> open("/proc/bus/usb", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1
> ENOENT (No such file or directory)
>
> What happened?
>
> This is Ubuntu 10.04 (it used NOT to be like that, not sure what I did
> wrong last time).   But running a VirtualBox INSIDE this same OS, I
> was able to get a result from "lsusb" (after enabling the USB devices in
> the VirtualBox interface) and strace gives this result:
>
> open("/dev/bus/usb/001/002", O_RDWR)= 3
> ioctl(3, USBDEVFS_IOCTL, 0xbff6f75c)= -1 ENOTTY (Inappropriate
> ioctl for device)
> close(3)        = 0
> open("/dev/bus/usb/001/001", O_RDWR)= 3
>
> Why the difference?
>
> --
> Regards,
> Peter Teoh



-- 
Regards,
Peter Teoh

