Simple Dom Helper For Codeigniter

Posted on : 18-06-2009 | In : Codeigniter, PHP


Screen scraping with PHP cURL has always been a pain in the arse, but Simple HTML DOM makes the job a walk in the park.

Since Simple HTML DOM is a class, you would normally create a library for it, but in this case it fits perfectly as a helper. Let me show you.

class Welcome extends Controller
{

	public function __construct()
	{
		// CodeIgniter 1.x style parent constructor call
		parent::Controller();

		// Makes file_get_html() available to this controller
		$this->load->helper('dom');
	}

	public function index()
	{
		// Grab the HTML from the URL
		$html = file_get_html('http://codeigniter.com/');

		// Print every link found on the CodeIgniter site
		foreach ($html->find('a') as $e)
		{
			echo $e->href . '<br>';
		}
	}
}

Produces a list like this:

[Image: Screen Scrape Example]

You can't get much easier than that.

The Simple HTML DOM web site has good examples of how to use the parser, and the download includes more extended examples.

You can either download the files from their site or download the CI DOM helper and a copy of their files here.
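For anyone wondering what such a helper looks like, here is a minimal sketch rather than the actual file from the download. It assumes a default CodeIgniter 1.x layout, where helpers live in system/application/helpers/ and are named with a _helper suffix, and it assumes simple_html_dom.php has been copied into the same folder. Both the path and the `$parser` variable are illustrative.

```php
<?php
// Hypothetical location: system/application/helpers/dom_helper.php
// Loaded in a controller with $this->load->helper('dom');

// Assumption: simple_html_dom.php from the Simple HTML DOM download
// has been copied into the same folder as this helper.
$parser = dirname(__FILE__) . '/simple_html_dom.php';

// Only pull the parser in once, and only if the file is really there.
if ( ! function_exists('file_get_html') && is_file($parser))
{
	require_once $parser;
}
```

Because the parser exposes plain functions such as file_get_html(), nothing needs to be instantiated, which is why a helper is enough here.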

Happy scraping.

Comments (19)

Hey, nice library, will be helpful..
Will be great if it supports xpath selectors..

Well, you are in luck: it supports CSS-style selectors, which cover most of the same ground, e.g.

// Find all <div> elements whose id attribute is foo
$ret = $html->find('div[id=foo]');
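Strictly speaking, the selector in the reply above is CSS-style rather than XPath. If real XPath expressions are needed, PHP's bundled DOM extension provides them. The following is a minimal sketch using DOMDocument and DOMXPath instead of Simple HTML DOM; the sample markup is made up for the example.

```php
<?php
// Sample markup standing in for a fetched page (made up for this example).
$markup = '<html><body>'
        . '<div id="foo"><a href="/one">One</a></div>'
        . '<div id="bar"><a href="/two">Two</a></div>'
        . '</body></html>';

// Real-world HTML is rarely valid, so collect parser warnings
// internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($markup);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Collect the href of every link inside the div whose id is "foo".
$hrefs = array();
foreach ($xpath->query('//div[@id="foo"]//a') as $link)
{
	$hrefs[] = $link->getAttribute('href');
}

echo implode("\n", $hrefs), "\n";
```

The trade-off is more setup code than a single find() call, in exchange for the full XPath syntax.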

that is really great!! thanks for sharing

nice helpers.. thanks…

Sweet! Hey I love your tutorials, keep up the good work…

They are good enough to be on binarycake dot com not that I ever want to pay for tutorials… but they are great!

-Brad

Thank you Brad. Glad you like them.

Thanks! This will save me hours!

hey,

Just found your site and I must say you've got some cool stuff here. I just wanted to ask how and where you save the file so it becomes a helper or library in CI, since the site does not have much in the docs about working with CI.

It would be best to read the CI user guide. Helpers & Libraries have their own folders.

Great article Lee, and great site on a whole – there are some very interesting screencasts.

On another note, it's refreshing to hear a Brit accent contributing to the community. Your accent sounds very local to me; whereabouts are you from?

All the best.

Hi George

Well, I'm from the South but now living in the North :-)

Been following your tutorials. Great work!

Thanks for sharing :) it saved my day .

This is awesome — nice share! My question is probably so basic but I’m just getting started… How would I insert this scraped list into my database, with each link inserted into a separate row?
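One way to approach the question above is to collect the hrefs in the foreach loop and run one insert per link. The sketch below uses plain PDO with an in-memory SQLite database so it runs on its own; the `links` table, its `url` column and the sample hrefs are all assumptions, and `INSERT OR IGNORE` is SQLite-specific syntax. Inside CodeIgniter the same loop could call the database class instead, for example `$this->db->insert('links', array('url' => $href))`.

```php
<?php
// Stand-in for the hrefs collected from $html->find('a') in the post.
$hrefs = array('/user_guide/', '/forums/', '/user_guide/', '');

// Assumption: a table called "links" with a unique "url" column.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE links (id INTEGER PRIMARY KEY, url TEXT NOT NULL UNIQUE)');

// One prepared statement, executed once per link, gives one row per link.
// OR IGNORE skips links that are already stored (SQLite syntax).
$insert = $db->prepare('INSERT OR IGNORE INTO links (url) VALUES (:url)');

foreach ($hrefs as $href)
{
	// Skip anchors that have no href at all.
	if ($href === '')
	{
		continue;
	}

	$insert->execute(array(':url' => $href));
}

$count = (int) $db->query('SELECT COUNT(*) FROM links')->fetchColumn();
echo $count, " links stored\n";
```

Using a prepared statement keeps scraped values from being interpreted as SQL, which matters when the data comes from someone else's page.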

I just came across this which might be found helpful for some: http://www.thephpx.com/2009/10/25/php-simple-html-dom-parser-codeigniter-integration/

Excellent and wonderful share. This work saved me plenty of time getting the HTML DOM parser set up in my CI library. I was expecting a very large amount of work, but your posts cut it right down.
Thanks, man

Regards,
Farhan Islam

Thanks so much!! :X

AMAZING!

Thank you, guys! Nice tips :D
