Sorting script with timing

This forum is used AT YOUR OWN RISK for experimenting with scripts and other things that could damage your own or other people's systems.

Post by NickyThomassen »

Last summer I got help writing a script that sorts the log file from my DNS server, which I run on the local disk station. Since then I have adapted it a bit, so that the host names being looked up and the IP addresses making the queries each get their own section in the final file.

I have also got timing to work, but the resolution is unfortunately only 1 second, so with a small log file (under 100,000 lines) the run does not always reach 1 second. The neat thing about the timer is that it can be called several times during the script, so smaller parts can be timed independently. I did not write it myself; I lifted it from Linux Journal. It is still a notch too advanced for me :)
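
The one-second resolution comes from `date '+%s'`. A minimal sketch of a finer-grained variant follows, assuming GNU date (its `%N` format gives nanoseconds; other `date` implementations may not support it) and a `sleep` that accepts fractions. The name `timer_ms` is hypothetical and is not part of the Linux Journal function:

```shell
#!/bin/bash

# Hypothetical millisecond variant of the timer; assumes GNU date with %N.
timer_ms() {
    local now
    now=$(( $(date '+%s%N') / 1000000 ))   # nanoseconds since the epoch -> ms
    if [[ $# -eq 0 ]]; then
        echo "$now"                        # no argument: print a start stamp
    else
        local dt=$(( now - $1 ))           # elapsed milliseconds
        printf '%d.%03d s\n' $(( dt / 1000 )) $(( dt % 1000 ))
    fi
}

# Each stamp is independent, so parts of a script can be timed separately.
total=$(timer_ms)
part=$(timer_ms)
sleep 0.2
echo "Part:  $(timer_ms "$part")"
echo "Total: $(timer_ms "$total")"
```

As with the original timer, calling it without arguments returns a start stamp, and passing a stamp back prints the time elapsed since it was taken.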

Here is the script

Code:

#!/bin/bash

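# Define the timer function.
# Without arguments it prints the current time in seconds since the epoch;
# given a start time it prints the elapsed time as h:mm:ss.
function timer()
{
    if [[ $# -eq 0 ]]; then
        date '+%s'
    else
        local stime=$1
        local etime dt ds dm dh
        etime=$(date '+%s')

        # An empty start time counts as zero elapsed time
        if [[ -z "$stime" ]]; then stime=$etime; fi

        dt=$((etime - stime))
        ds=$((dt % 60))
        dm=$(((dt / 60) % 60))
        dh=$((dt / 3600))
        printf '%d:%02d:%02d' "$dh" "$dm" "$ds"
    fi
}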

# Start timing
tmr=$(timer)
clear && echo "Starting up;"

# Move the log to shared memory, or stop if it is missing
if [ -f /media/dc-do/bind_queries.log ]; then
    mv /media/dc-do/bind_queries.log /dev/shm/queries
else
    echo "File not found in /media/dc-do/, exiting"
    exit 1
fi

# 0 Series, counts lines
lines=$(wc -l < /dev/shm/queries)
echo "--- There are $lines lines in total in the log ---" > /home/titanus/Desktop/værter
echo >> /home/titanus/Desktop/værter
unset lines

echo "File found, continuing;"

# A Series, covers IPs (top-15)
# Field 5 is client#port. -w 13 compares only the first 13 characters,
# which assumes addresses of the form 192.168.20.xx; -d drops clients seen once.
echo "--- Showing only the top 15 IP addresses ---" >> /home/titanus/Desktop/værter
echo >> /home/titanus/Desktop/værter
awk '{ print $5 }' /dev/shm/queries | sort | uniq -cd -w 13 | sort -rbfn | head -n 15 >> /home/titanus/Desktop/værter
echo >> /home/titanus/Desktop/værter
echo "A done;"

# B Series, covers pages (top-125)
# Field 7 is the name that was looked up; -d drops names seen once.
echo "--- Showing only the top 125 URLs ---" >> /home/titanus/Desktop/værter
echo >> /home/titanus/Desktop/værter
awk '{ print $7 }' /dev/shm/queries | sort -r | uniq -cd | sort -rbfn | head -n 125 >> /home/titanus/Desktop/værter
# The log was moved here at the start, so this deletes it for good
rm /dev/shm/queries
echo "B done;"

# Stop timing. The timer can be called several times in one script.
printf 'Ran in: %s\n' "$(timer "$tmr")"
exit 0
# EOF


Here is a sample from the raw log file

Code:

30-Dec-2011 21:24:44.011 info: client 192.168.20.48#53226: query: nyhederne.tv2.dk IN A +
30-Dec-2011 21:24:44.013 info: client 192.168.20.48#49159: query: vejret.tv2.dk IN A +
30-Dec-2011 21:24:44.014 info: client 192.168.20.48#50859: query: finans.tv2.dk IN A +
30-Dec-2011 21:24:44.050 info: client 192.168.20.48#53971: query: sportsresultater.tv2.dk IN A +
30-Dec-2011 21:24:44.050 info: client 192.168.20.48#49616: query: privatliv.tv2.dk IN A +
30-Dec-2011 21:24:44.051 info: client 192.168.20.48#52098: query: manager.tv2.dk IN A +
30-Dec-2011 21:24:44.097 info: client 192.168.20.48#58822: query: www.google.com IN A +
30-Dec-2011 21:24:44.151 info: client 192.168.20.48#55836: query: www.googleadservices.com IN A +
30-Dec-2011 21:24:44.286 info: client 192.168.20.48#59421: query: tv.tv2.dk IN A +
30-Dec-2011 21:24:44.287 info: client 192.168.20.48#60467: query: omtv2.tv2.dk IN A +
30-Dec-2011 21:24:47.250 info: client 192.168.20.48#62955: query: plus.google.com IN A +
30-Dec-2011 21:24:48.716 info: client 192.168.20.45#41349: query: www.amazon.co.uk IN A +
30-Dec-2011 21:24:48.716 info: client 192.168.20.45#41349: query: www.amazon.co.uk IN AAAA +
30-Dec-2011 21:25:04.914 info: client 192.168.20.45#44793: query: www.amazon.co.uk IN A +
30-Dec-2011 21:25:04.915 info: client 192.168.20.45#44793: query: www.amazon.co.uk IN AAAA +
30-Dec-2011 21:25:04.928 info: client 192.168.20.45#37303: query: images-na.ssl-images-amazon.com IN A +
30-Dec-2011 21:25:04.928 info: client 192.168.20.45#37303: query: images-na.ssl-images-amazon.com IN AAAA +
30-Dec-2011 21:25:04.931 info: client 192.168.20.45#37236: query: images-eu.ssl-images-amazon.com IN A +
30-Dec-2011 21:25:04.931 info: client 192.168.20.45#37236: query: images-eu.ssl-images-amazon.com IN AAAA +
30-Dec-2011 21:25:05.382 info: client 192.168.20.42#35305: query: ubuntudanmark.dk IN A +
30-Dec-2011 21:25:05.382 info: client 192.168.20.42#35305: query: ubuntudanmark.dk IN AAAA +
30-Dec-2011 21:25:05.495 info: client 192.168.20.45#43999: query: ocsp.comodoca.com IN A +
30-Dec-2011 21:25:05.496 info: client 192.168.20.45#43999: query: ocsp.comodoca.com IN AAAA +
30-Dec-2011 21:25:06.820 info: client 192.168.20.45#51402: query: armorgames.com IN A +
30-Dec-2011 21:25:06.821 info: client 192.168.20.45#51402: query: armorgames.com IN AAAA +
30-Dec-2011 21:25:06.955 info: client 192.168.20.45#58298: query: www.amazon.co.uk IN A +
30-Dec-2011 21:25:06.955 info: client 192.168.20.45#58298: query: www.amazon.co.uk IN AAAA +
30-Dec-2011 21:25:06.966 info: client 192.168.20.45#33735: query: www.amazon.co.uk IN A +
30-Dec-2011 21:25:06.967 info: client 192.168.20.45#33735: query: www.amazon.co.uk IN AAAA +
30-Dec-2011 21:25:13.613 info: client 192.168.20.42#50972: query: www.bbsyd.dk IN A +
30-Dec-2011 21:25:13.613 info: client 192.168.20.42#50972: query: www.bbsyd.dk IN AAAA +
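
awk splits each line on whitespace, which is why the script picks field 5 for the client and field 7 for the name being queried. A small sketch on one of the sample lines above makes the numbering visible (the variable name `line` is only for illustration):

```shell
#!/bin/bash

# One line copied from the log sample above.
line='30-Dec-2011 21:24:44.011 info: client 192.168.20.48#53226: query: nyhederne.tv2.dk IN A +'

# Print every whitespace-separated field together with its awk field number.
awk '{ for (i = 1; i <= NF; i++) printf "$%d = %s\n", i, $i }' <<< "$line"
```

Field 5 comes out as `192.168.20.48#53226:` (address, source port and a trailing colon) and field 7 as `nyhederne.tv2.dk`.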


And here is an abbreviated version of the result

Code:

--- There are 19803 lines in total in the log ---

--- Showing only the top 15 IP addresses ---

   9397 192.168.20.38#32770:
   4121 192.168.20.45#32814:
   2986 192.168.20.42#32770:

...

--- Showing only the top 125 URLs ---

    947 mail.google.com
    587 .
    322 www.facebook.com
    298 webhotel17.webhosting.dk
    284 thethinkingatheist.com

...
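
The `#32770:` suffix in the IP section is the source port of the first line in each group. The script groups on the first 13 characters of field 5, which matches addresses like 192.168.20.38 exactly but would mis-group shorter or longer ones. Here is a hedged sketch of an alternative that strips the port before counting. The function name `top_clients` and the sample data are made up for illustration, and unlike the original it also lists clients with only one query, since it leaves out `-d`:

```shell
#!/bin/bash

# Hypothetical helper: count queries per client IP, ignoring the source port.
# Reads a BIND query log on stdin; $1 is how many entries to show.
top_clients() {
    awk '{ split($5, part, "#"); print part[1] }' |
        sort | uniq -c | sort -rn | head -n "$1"
}

# Made-up sample with addresses of different lengths.
top_clients 15 <<'EOF'
30-Dec-2011 21:24:44.011 info: client 192.168.20.48#53226: query: vejret.tv2.dk IN A +
30-Dec-2011 21:24:44.013 info: client 192.168.20.48#49159: query: tv.tv2.dk IN A +
30-Dec-2011 21:24:45.100 info: client 192.168.20.5#40000: query: www.google.com IN A +
EOF
```

With this sample, 192.168.20.48 is counted twice and 192.168.20.5 once, regardless of which ports the queries came from.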