The Def Guide to Zzap!64

The Zzap Rrap

All times are UTC [ DST ]




Posted: Sat Mar 05, 2005 1:54 pm
Techno Teaboy

Joined: Sat Mar 05, 2005 1:40 pm
Posts: 16
Location: Nottingham
*Gets over the shock of joining a C64 forum*

Hi folks,

I will be loading the first few issues of C&VG onto the server this afternoon. There is a page viewer which, errmm, views the pages.

If you become a member, there is also a PDF download, which is currently just the images in one file. I have a project running to do proper PDFs of magazines so they are searchable, but this is a huge task, so if anyone wants to help out, you are more than welcome. Just check the WoS forum under Preservation.

If anyone would like other magazines hosting to preserve bandwidth, drop me a mail and I'll see what I can do. My site is for Sinclair stuff, but I see no reason not to put others on there.

Please be gentle with the downloads (i.e. don't do them all at once!). Bandwidth is not an issue, but it slows down other people's downloads.

You can see the site at http://www.sinclair-heaven.net

It's still being developed, so be patient!


Posted: Mon Mar 07, 2005 1:17 am
Admin

Joined: Tue Jun 17, 2003 6:42 pm
Posts: 2103
Location: Cavan, Ireland
Excuse my French ( ;-) ) but that's some amazing shit! How the hell do you have access to over 25GB of space and no bandwidth issues??!

Regarding the PDFs, I assume at the moment they are just a collection of image files?

To make them searchable you'd have to OCR them, which is easier on the older issues before they started putting mad colourful design behind the text.

It would be great to have all the issues OCRed and searchable, but it's a massive job since OCR software hasn't quite got human-like AI at the moment :(


Posted: Mon Mar 07, 2005 1:47 am
Techno Teaboy

Joined: Sat Mar 05, 2005 1:40 pm
Posts: 16
Location: Nottingham
I have a project running at the moment to OCR all magazines on my site. I'm starting with reviews as it gives people something to aim for. It's pretty boring just OCRing pages, and it takes so long that people are likely to just give up. So at the moment, they are just PDFs of the scans. As the site is around 90% PHP scripted, it takes a while to do the PDF, then move everything into the correct folders so the scripts work as intended. I tend to do around 10 a day.

I feel sorry for Mort as it's even more of a pain to scan them in the first place! At least the OCR software can read his scans.

As for the bandwidth - you might notice my site is a bit slow. I had 53 magazine downloads today, and at about 30MB per magazine, it can slow things down. I am hosting from one of my home PCs, so the only restriction I have is NTL's upload speed. It's a bit crap, but it's free!
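
To put rough numbers on that, here is a back-of-envelope sketch in Python. The 53 downloads and roughly 30MB per magazine come from the post; the uplink speed is purely an assumed figure for illustration, since the actual NTL upload rate isn't stated anywhere in the thread.

```python
# Back-of-envelope estimate of upload traffic for a home-hosted archive.
DOWNLOADS_PER_DAY = 53        # from the post
MB_PER_MAGAZINE = 30          # approximate figure from the post
ASSUMED_UPLOAD_KBIT_S = 256   # hypothetical uplink speed, NOT from the post


def daily_volume_mb(downloads: int, mb_each: float) -> float:
    """Total data served in a day, in megabytes."""
    return downloads * mb_each


def minutes_per_download(mb_each: float, upload_kbit_s: float) -> float:
    """Minutes to send one file if it has the whole uplink to itself."""
    kilobits = mb_each * 8 * 1000  # 1 MB taken as 1000 kB, 8 bits per byte
    return kilobits / upload_kbit_s / 60


if __name__ == "__main__":
    total = daily_volume_mb(DOWNLOADS_PER_DAY, MB_PER_MAGAZINE)
    each = minutes_per_download(MB_PER_MAGAZINE, ASSUMED_UPLOAD_KBIT_S)
    print(f"Served today: about {total} MB")
    print(f"One magazine at the assumed uplink speed: about {each:.1f} minutes")
```

Several simultaneous downloads share the same uplink, so each one takes proportionally longer, which is why doing them one at a time is kinder to everyone else.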

I have just installed a 200GB SATA drive, so space is never going to be an issue. This is why I have offered loads of people sections on the site (Gamebase & Sam Coupé being two of them). It pads the site out & all I need to do is drop the files in.

Like I said - the site is still being developed & some of the scripts I am developing are also becoming commercial, so I need to concentrate on those first. In particular, my stats package & forum. It just takes ages to write them & test them (thank god for WoS users!)

Lastly - you have a strange grasp of French. It looks English to me. Unless this forum has a babel converter. (Now there's an idea!)


Posted: Mon Mar 07, 2005 2:17 am
Admin

Joined: Tue Jun 17, 2003 6:42 pm
Posts: 2103
Location: Cavan, Ireland
fogartylee wrote:
At least the OCR software can read his scans.



It can?!! The OCR software I am/was using, TextBridge, couldn't make out his scans at 957xwhatever resolution. I found I needed to rescan the pages at around 2000x4000 or something to get a clean run at OCRing them. Very boring job alright! Maybe we could outsource it to India or something? Did they get English copies of Zzap there in the 80s? ;-)


Posted: Mon Mar 07, 2005 8:53 am
Techno Teaboy

Joined: Sat Mar 05, 2005 1:40 pm
Posts: 16
Location: Nottingham
I'm using OmniPage Pro 14, and it's great. I need to reformat some pages after, but can't complain. I've seen the same results with TextBridge & it's not that different. Have you tried saving the output to a Word document?
I can't speak for your strange magazines, but the Sinclair ones are easy enough.

By the way, I guess there is a conspiracy with Commodore & Google. The only reason I knew my site had been mentioned here was because I did a search for 'Sinclair Heaven', and the ONLY result with my site name in it was this one!


Posted: Mon Mar 07, 2005 9:55 am
King of Ludlow

Joined: Thu Jun 19, 2003 10:22 pm
Posts: 1139
Location: Ludlow
fogartylee wrote:

By the way, I guess there is a conspiracy with Commodore & Google. The only reason I knew my site had been mentioned here was because I did a search for 'Sinclair Heaven', and the ONLY result with my site name in it was this one!


:D That's because Speccies suck, of course, and Commodore rules. :wink:

_________________
Once again I emerge from beneath a massive pile of paper which makes my desk groan to bring you the world’s most amazing posts.


Posted: Mon Mar 07, 2005 10:54 am
Techno Teaboy

Joined: Sat Mar 05, 2005 1:40 pm
Posts: 16
Location: Nottingham
You could be right.

Hang on a minute - is that the same Mr.Zzapback that became a member of my site at the weekend?

Hmmm, you guys better beware, I think you have a spy in the camp!

Seriously though, I noticed you had downloaded some mags. Any comments?


Posted: Mon Mar 07, 2005 7:30 pm
Admin

Joined: Tue Jun 17, 2003 6:42 pm
Posts: 2103
Location: Cavan, Ireland
fogartylee wrote:
I'm using OmniPage Pro 14, and it's great. I need to reformat some pages after, but can't complain. I've seen the same results with TextBridge & it's not that different. Have you tried saving the output to a Word document?
I can't speak for your strange magazines, but the Sinclair ones are easy enough.


While Mort's scans are fine for reading with the human eye, they don't have enough detail for OCRing; the output text comes out with a LOAD of errors. But if I rescan the page at a higher resolution and then OCR it, there are a lot fewer errors. But as regards later issues with lots of crappy multicolour backgrounds...... it's a dead loss.

Do you Photoshop the scans before you OCR them?
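
One rough way to put a number on "a LOAD of errors" is to compare the OCR output against a hand-corrected copy of the same passage. This is only a sketch using Python's standard difflib module; the sample strings are made up for illustration and are not real OCR output from any of the scans discussed here.

```python
from difflib import SequenceMatcher


def similarity(ocr_text: str, reference: str) -> float:
    """Return a score from 0 to 1; 1.0 means the two texts are identical."""
    return SequenceMatcher(None, ocr_text, reference).ratio()


if __name__ == "__main__":
    # Hypothetical samples: a hand-typed reference and two invented OCR runs.
    reference = "The graphics are superb"
    low_res_run = "Tbe grapbics arc supcrb"
    high_res_run = "The graphics are supcrb"

    print("low-res scan :", round(similarity(low_res_run, reference), 3))
    print("high-res scan:", round(similarity(high_res_run, reference), 3))
```

Scoring a short passage from the same page at two scan resolutions would show whether the rescan is worth the extra time before committing to a whole issue.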


Posted: Mon Mar 07, 2005 8:12 pm
Techno Teaboy

Joined: Sat Mar 05, 2005 1:40 pm
Posts: 16
Location: Nottingham
No. These are 'raw' results from one of Mort's scans. I didn't do these by the way, but I have done the OmniPage one and got the same results:

http://www.sinclair-heaven.net/crash_omnipage.zip

http://www.sinclair-heaven.net/crash_textbridge.zip

These were both cut 'n' paste jobs into a Word document.


Posted: Mon Mar 07, 2005 8:26 pm
King of Ludlow

Joined: Thu Jun 19, 2003 10:22 pm
Posts: 1139
Location: Ludlow
fogartylee wrote:
No. These are 'raw' results from one of Mort's scans. I didn't do these by the way, but I have done the OmniPage one and got the same results:

http://www.sinclair-heaven.net/crash_omnipage.zip

http://www.sinclair-heaven.net/crash_textbridge.zip

These were both cut 'n' paste jobs into a Word document.


That looks quite interesting.
Imagine all Crash & Zzap (etc) text in a database, together with a search engine, now that would be ideal!
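
As a very rough illustration of that idea, the sketch below builds a tiny in-memory word index over OCRed pages and answers "which pages contain all of these words". It is a toy, not a proposal for how either site should do it: the page IDs and sample text are invented, and a real archive would more likely sit in a proper database with full-text search.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the set of page IDs whose text contains it."""
    index: dict[str, set[str]] = defaultdict(set)
    for page_id, text in pages.items():
        for word in tokenize(text):
            index[word].add(page_id)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the page IDs that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


if __name__ == "__main__":
    # Hypothetical page IDs and text, standing in for real OCR output.
    pages = {
        "crash-issue01-p12": "A fast platform game with smooth scrolling.",
        "zzap-issue03-p40": "A shoot 'em up with superb sound and a tough game plan.",
    }
    index = build_index(pages)
    print(search(index, "platform game"))
    print(search(index, "game"))
```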

_________________
Once again I emerge from beneath a massive pile of paper which makes my desk groan to bring you the world’s most amazing posts.


Posted: Tue Mar 08, 2005 1:11 am
Techno Teaboy

Joined: Sat Mar 05, 2005 1:40 pm
Posts: 16
Location: Nottingham
Yep - one of the reasons I wanted reviews doing first was for that very reason. It will then be easier to do the rest of the mags as PDFs and not as boring.


Posted: Fri Apr 22, 2005 3:11 am
Techno Teaboy

Joined: Fri Apr 22, 2005 2:41 am
Posts: 4
fogartylee wrote:


Well, I tried the links but they don't work anymore; the result of the link I find quite interesting though. BTW the first "you mum" needs an "r". Good luck with the scanning.


Posted: Mon Aug 22, 2005 8:44 am
Ken's Fishy Friend

Joined: Mon Aug 22, 2005 2:33 am
Posts: 33
I'm planning on doing some scanning and OCRing in the near future of various 80s and 90s magazines I have at hand (Zzap!64, PCG, CD32, The One, Amiga Format etc.) and was after some general advice on scanning/OCRing please.

Space for me is no issue (large HDD); I'd just like to keep the 'best' quality copies I can as well as have them easily searchable. Any suggestions most welcome re: procedure, settings and software.


Posted: Wed Aug 24, 2005 7:58 am
Ken's Fishy Friend

Joined: Mon Aug 22, 2005 2:33 am
Posts: 33
Hmm, no comments.. oh well.

Also, Sinclair Heaven seems to have lost its tracker? There's no seed anymore for any of the torrents.


Posted: Wed Aug 24, 2005 12:20 pm
Admin

Joined: Tue Jun 17, 2003 6:42 pm
Posts: 2103
Location: Cavan, Ireland
Well for OCRing, I use TextBridge Pro. I scan the pages at 300dpi or so to give a horizontal resolution of over 2000 pixels.
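
For anyone wanting to check the arithmetic, scan width in pixels is just page width in inches times dpi. The sketch below assumes an A4-sized page (210 mm wide), which is an assumption about the magazine's trim size rather than a measured figure; at 300 dpi that works out to roughly 2480 pixels across. The same sum run backwards suggests the 957-pixel-wide scans mentioned earlier in the thread are only about 115 dpi on a page that size.

```python
MM_PER_INCH = 25.4
ASSUMED_PAGE_WIDTH_MM = 210  # A4 width; an assumption, not a measured magazine size


def pixels_across(page_width_mm: float, dpi: float) -> int:
    """Horizontal pixel count for a page scanned at the given dpi."""
    return round(page_width_mm / MM_PER_INCH * dpi)


def effective_dpi(pixel_width: int, page_width_mm: float) -> float:
    """Resolution implied by an existing scan's pixel width."""
    return pixel_width / (page_width_mm / MM_PER_INCH)


if __name__ == "__main__":
    print("300 dpi scan width:", pixels_across(ASSUMED_PAGE_WIDTH_MM, 300), "px")
    print("957 px wide scan is about",
          round(effective_dpi(957, ASSUMED_PAGE_WIDTH_MM)), "dpi")
```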

Coloured backgrounds, or especially changing backgrounds, can really screw up the OCRing, so sometimes I have to use Paint Shop Pro to colour replace them to just bare white etc.
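
The same clean-up can be roughed out in code. The sketch below is not what Paint Shop Pro's colour replacer does; it swaps in a simpler technique, a plain brightness threshold, and assumes dark text on a lighter coloured background. The image is represented as a plain list of rows of (R, G, B) tuples so the example needs no imaging library, and the threshold value is an arbitrary starting point to tune per page.

```python
WHITE = (255, 255, 255)


def luminance(rgb: tuple[int, int, int]) -> float:
    """Perceived brightness of a pixel, 0 (black) to 255 (white)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def whiten_background(pixels, dark_threshold: float = 100):
    """Keep dark (text) pixels and turn everything brighter into white.

    dark_threshold is a hypothetical default, not a recommended setting.
    """
    return [
        [px if luminance(px) < dark_threshold else WHITE for px in row]
        for row in pixels
    ]


if __name__ == "__main__":
    # One tiny made-up row: near-black text pixel, then a yellow background pixel.
    page = [[(10, 10, 10), (200, 180, 40)]]
    print(whiten_background(page))
```

This will not rescue light text on a dark panel or text printed over a photo; those pages would still need manual work.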

Then it's time to actually do the OCRing, which by this stage is fairly painless, although it takes a while to format the output text.

It's a very slow, boring process unfortunately, but it's worth it in the end I guess! :)

Feel free to OCR any Zzap stuff for this site! :) Just make sure it hasn't been done already first.

