Problem:- You want to use Hive tables from Spark SQL and vice versa. In that case you need to follow these steps.
Solution:- Please find the steps below.
Step 1- Copy hive-site.xml from the Hive configuration directory ($HIVE_HOME/conf) to the Spark configuration directory ($SPARK_HOME/conf).
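Step 1 can be sketched in Python as below (the paths are assumptions based on the example later in this post; adjust them to your installation):

```python
import os
import shutil

# Assumed install locations; substitute your own HIVE_HOME / SPARK_HOME.
HIVE_SITE = os.path.expandvars("$HOME/Software/hive/conf/hive-site.xml")
SPARK_CONF_DIR = os.path.expandvars("$HOME/Software/spark/conf")

def copy_hive_site(src, dst_dir):
    """Copy hive-site.xml into Spark's conf directory so Spark SQL
    picks up the same metastore configuration as Hive."""
    os.makedirs(dst_dir, exist_ok=True)
    return shutil.copy(src, dst_dir)
```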
Step 2- Add the below property to hive-site.xml in both Spark and Hive.
You can provide localhost or the IP of the metastore machine.
<property>
<name>hive.metastore.uris</name>
<value>thrift://localhost:9083</value>
<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
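As a quick sanity check, the property can be read back with Python's standard library to confirm the metastore URI is set and well formed (the XML string here mirrors the snippet above; nothing Spark-specific is needed):

```python
import xml.etree.ElementTree as ET

# The same property block as above, wrapped in hive-site.xml's root element.
HIVE_SITE_XML = """
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
  </property>
</configuration>
"""

def metastore_uri(xml_text):
    """Return the value of hive.metastore.uris, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == "hive.metastore.uris":
            return prop.findtext("value")
    return None
```

A common mistake is the `thift://` typo; checking that the value starts with `thrift://` catches it early.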
Step 3- Add the Hive path in spark-env.sh.
e.g. export HIVE_HOME=$HOME/Software/hive
Step 4- Start the Hive metastore using the below command
bin/hive --service metastore
Step 5- Start pyspark and verify that the Hive databases are visible
>>> spark.sql("show databases").show()
+------------+
|databaseName|
+------------+
| default|
| sparkdb|
| testdb|
+------------+
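Going the other way, a table created through Spark SQL becomes visible to Hive, because both now share the same metastore. A minimal sketch, assuming pyspark is installed and the metastore from Step 4 is running (the database and table names are illustrative):

```python
# Illustrative DDL; sparkdb/employees are example names, not required ones.
DDL = "CREATE TABLE IF NOT EXISTS sparkdb.employees (id INT, name STRING)"

def create_and_list(spark):
    """Create a table through Spark SQL and return the visible databases.
    Hive sees the new table too, since both read the shared metastore."""
    spark.sql("CREATE DATABASE IF NOT EXISTS sparkdb")
    spark.sql(DDL)
    return [row.databaseName for row in spark.sql("SHOW DATABASES").collect()]

if __name__ == "__main__":
    from pyspark.sql import SparkSession
    spark = (SparkSession.builder
             .appName("hive-metastore-check")
             .enableHiveSupport()  # reads hive-site.xml from $SPARK_HOME/conf
             .getOrCreate())
    print(create_and_list(spark))
```

After running this, `show tables in sparkdb;` from the Hive CLI should list the new table. Note that `enableHiveSupport()` is needed when building your own SparkSession; the pyspark shell above configures this for you.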
If you are getting the below error while starting Hive or running a Spark SQL command:
Caused by: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir}/${system:user.name}
Solution:- Provide an appropriate folder under /tmp by adding the below properties to hive-site.xml.
<property>
<name>system:java.io.tmpdir</name>
<value>/tmp/hive/spark</value>
</property>
<property>
<name>system:user.name</name>
<value>${user.name}</value>
</property>
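The folder named in system:java.io.tmpdir must exist and be writable by the user running Hive/Spark. A small sketch to create it (the path matches the value above):

```python
import os

TMPDIR = "/tmp/hive/spark"  # must match the system:java.io.tmpdir value

def ensure_tmpdir(path):
    """Create the scratch directory Hive/Spark will use, if missing,
    and make it owner-writable."""
    os.makedirs(path, exist_ok=True)
    os.chmod(path, 0o755)
    return path
```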